diff --git a/.gitignore b/.gitignore index bb42d2a49..ce253a19c 100644 --- a/.gitignore +++ b/.gitignore @@ -10,3 +10,4 @@ tags tegra/ *.i .download* +/Debug/ diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 000000000..d4200db2b --- /dev/null +++ b/.gitmodules @@ -0,0 +1,12 @@ +[submodule "modules/v4d/third/imgui"] + path = modules/v4d/third/imgui + url = https://github.com/kallaballa/imgui.git +[submodule "modules/v4d/third/doxygen-bootstrapped"] + path = modules/v4d/third/doxygen-bootstrapped + url = https://github.com/kallaballa/doxygen-bootstrapped.git +[submodule "modules/v4d/third/nanovg"] + path = modules/v4d/third/nanovg + url = https://github.com/kallaballa/nanovg +[submodule "bgfx.cmake"] + path = modules/v4d/third/bgfx.cmake + url = https://github.com/bkaradzic/bgfx.cmake.git diff --git a/README.md b/README.md index 8d3ecda04..66ae0ae88 100644 --- a/README.md +++ b/README.md @@ -1,60 +1,85 @@ -## Repository for OpenCV's extra modules +## Introduction to "Plan" and "V4D" -This repository is intended for the development of so-called "extra" modules, -contributed functionality. New modules quite often do not have stable API, -and they are not well-tested. Thus, they shouldn't be released as a part of the -official OpenCV distribution, since the library maintains binary compatibility, -and tries to provide decent performance and stability. +### Overview of "Plan" +**Plan** is a computational graph engine built with C++20 templates, enabling developers to construct directed acyclic graphs (DAGs) from fragments of algorithms. By leveraging these graphs, Plan facilitates the optimization of parallel and concurrent algorithms, ensuring efficient resource utilization. The framework divides the lifetime of an algorithm into two distinct phases: **inference** and **execution**. -So, all the new modules should be developed separately, and published in the -`opencv_contrib` repository at first. Later, when the module matures and gains -popularity, it is moved to the central OpenCV repository, and the development team -provides production-quality support for this module. +- **Inference Phase:** During this phase, the computational graph is constructed by running the Plan implementation. This process organizes the algorithm's fragments and binds them to data, which may be classified as: + - **Safe Data:** Member variables of the Plan. + - **Shared Data:** External variables (e.g., global or static data). + + Functions and data are explicitly flagged as shared when necessary, adhering to Plan’s transparent approach to state management. The framework discourages hidden states, as they impede program integrity and graph optimization. -### How to build OpenCV with extra modules +- **Execution Phase:** This phase executes the constructed graph using the defined nodes and edges. Nodes typically represent algorithmic fragments such as functions or lambdas, while edges define data flow, supporting various access patterns (e.g., read, write, copy). -You can build OpenCV, so it will include the modules from this repository. Contrib modules are under constant development and it is recommended to use them alongside the master branch or latest releases of OpenCV. +Plan also allows hierarchical composition, where one Plan may be composed of other sub-Plans. Special rules govern data sharing in such compositions to maintain performance and correctness. Currently, optimizations are limited to “best-effort” pipelining, with plans for more sophisticated enhancements. 
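To make the two phases concrete, here is a minimal, self-contained sketch in plain C++20 of the idea behind graph construction versus graph execution. It is an illustration only and does not use the actual Plan API; the names (`TinyGraph`, `node`, `run`) are hypothetical.

```cpp
#include <functional>
#include <iostream>
#include <vector>

// Illustrative stand-in for a computational graph: nodes are algorithm
// fragments (lambdas); edges are implied by the data the fragments touch.
class TinyGraph {
    std::vector<std::function<void()>> nodes_;
public:
    // "Inference" phase: record a fragment instead of executing it right away.
    void node(std::function<void()> fragment) {
        nodes_.push_back(std::move(fragment));
    }
    // "Execution" phase: run the previously constructed graph, possibly many times.
    void run(size_t iterations) {
        for (size_t i = 0; i < iterations; ++i)
            for (auto& n : nodes_)
                n();
    }
};

int main() {
    TinyGraph graph;
    int counter = 0; // analogous to "safe data": state owned by the plan itself

    // Inference: fragments are bound to data here, but nothing runs yet.
    graph.node([&counter] { ++counter; });
    graph.node([&counter] { std::cout << "counter = " << counter << '\n'; });

    // Execution: the recorded graph is run as a whole.
    graph.run(3);
}
```

A real Plan additionally records *how* each fragment accesses its data (read, write, copy), which is what allows the engine to pipeline independent fragments safely.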
-Here is the CMake command for you: +### Overview of "V4D" +**V4D** is a versatile 2D/3D graphics runtime designed to integrate seamlessly with Plan. Built atop OpenGL (3.2 or ES 3.0), V4D extends its functionality through bindings to prominent libraries: +- **NanoVG:** For 2D vector and raster graphics, including font rendering. +- **bgfx:** A 3D engine modified to defer its concurrency model to Plan for optimal parallel execution. +- **ImGui:** A lightweight GUI overlay. -``` -$ cd -$ cmake -DOPENCV_EXTRA_MODULES_PATH=/modules -$ make -j5 -``` +V4D encourages direct OpenGL usage and external API integrations via **context sharing**, which is implemented using shared textures. Each external API operates within its own isolated OpenGL state machine, maintaining thread safety and modularity. -As the result, OpenCV will be built in the `` with all -modules from `opencv_contrib` repository. If you don't want all of the modules, -use CMake's `BUILD_opencv_*` options. Like in this example: +The runtime’s capabilities are further augmented by its integration with OpenCV, providing: +- **Hardware Acceleration:** Utilizing OpenGL for graphics, VAAPI and NVENC for video, and OpenCL-OpenGL interop for compute tasks. +- **Data Sharing on GPU:** Depending on hardware and software features, V4D can directly share or copy data within GPU memory for efficient processing. -``` -$ cmake -DOPENCV_EXTRA_MODULES_PATH=/modules -DBUILD_opencv_legacy=OFF -``` +### Integration and Platform Support +V4D and Plan share a tightly coupled design, simplifying combined use cases. However, plans are underway to decouple them, enabling the adoption of alternative runtimes. V4D is actively developed for Linux (X11 and Wayland via EGL or GLX), with auto-detection of supported backends. While macOS support lags slightly, Windows compatibility remains untested but is considered during development. -If you also want to build the samples from the "samples" folder of each module, also include the "-DBUILD_EXAMPLES=ON" option. +### Key Principles and Features +1. **Fine-Grained Edge Calls:** Plan introduces specialized edge calls (e.g., `R`, `RW`, `V`) to define data access patterns, supporting smart pointers and OpenCV `UMat` objects. This granularity allows better graph optimization. +2. **State and Data Transparency:** Functions and data in a Plan must avoid introducing hidden states unless explicitly marked as shared. This principle ensures the integrity of the graph and its optimizations. +3. **Parallelism and Pipelining:** Multiple OpenGL contexts can be created and utilized in parallel, making V4D a robust solution for high-performance graphics applications. +4. **Algorithm Modularity:** By structuring algorithms into smaller, reusable fragments or sub-Plans, Plan fosters modular development and scalability. -If you prefer using the GUI version of CMake (cmake-gui), then, you can add `opencv_contrib` modules within `opencv` core by doing the following: +### Selected Commented Examples (read sequentially) +The following examples have been selected to deepen your understanding of Plan-V4D. There are many more. -1. Start cmake-gui. +#### Blue Screen using OpenGL +[source](modules/v4d/samples/render_opengl.cpp) -2. Select the opencv source code folder and the folder where binaries will be built (the 2 upper forms of the interface). +#### Displaying an Image using NanoVG +[source](modules/v4d/samples/display_image_nvg.cpp) -3. Press the `configure` button. You will see all the opencv build parameters in the central interface.
+#### A real-time beauty filter (using sub-plans) +[source](modules/v4d/samples/beauty-demo.cpp) -4. Browse the parameters and look for the form called `OPENCV_EXTRA_MODULES_PATH` (use the search form to focus rapidly on it). +## Why Plan-V4D? -5. Complete this `OPENCV_EXTRA_MODULES_PATH` by the proper pathname to the `/modules` value using its browse button. +* Computation Graph Engine: Fast parallel code. +* OpenGL: Easy access to OpenGL. +* GUI: Simple yet powerful user interfaces through ImGui. +* Vector graphics: Elegant and fast vector graphics through NanoVG. +* 3D graphics: Powerful 3D graphics through bgfx. +* Font rendering: Loading of fonts and sophisticated rendering options. +* Video pipeline: Through a simple source/sink system videos can be efficiently read, displayed, edited and saved. +* Hardware acceleration: Transparent hardware acceleration usage where possible (e.g. OpenGL, OpenCL, CL-GL interop, VAAPI and CL-VAAPI interop, NVENC). +* No more highgui with its heavy dependencies, licenses and limitations. -6. Press the `configure` button followed by the `generate` button (the first time, you will be asked which makefile style to use). +Please refer to the examples and demos as well as [this OpenCV issue](https://github.com/opencv/opencv/issues/22923) to find out exactly what it can do for you. -7. Build the `opencv` core with the method you chose (make and make install if you chose Unix makefile at step 6). +## GPU Support +* Intel Gen 8+ (Gen 11 and Gen 13 tested) +* NVIDIA Ada Lovelace (RTX 4070 Ti tested) with proprietary drivers (535.104.05) and CUDA toolkit (12.2) +* Intel Arc A770 (Mesa 24.3.1) tested +* AMD: never tested -8. To run, linker flags to contrib modules will need to be added to use them in your code/IDE. For example to use the aruco module, "-lopencv_aruco" flag will be added. +## Requirements +* C++20 (at the moment) +* OpenGL 3.2 Core (optionally Compat)/OpenGL ES 3.0/WebGL2 -### Update the repository documentation +## Optional requirements +* Support for OpenCL 1.2 +* Support for the cl_khr_gl_sharing and cl_intel_va_api_media_sharing OpenCL extensions (a minimal probe for the OpenCL path is sketched below). -In order to keep a clean overview containing all contributed modules, the following files need to be created/adapted: +## Dependencies +* My OpenCV 4.x fork (it works with mainline OpenCV 4.x as well, but using my fork is highly recommended because it features several improvements and fixes) +* GLEW +* GLFW3 +* NanoVG (included as a submodule) +* ImGui (included as a submodule) +* bgfx (included as a submodule) +* Glad (included) -1. Update the README.md file under the modules folder. Here, you add your model with a single-line description. - -2. Add a README.md inside your own module folder. This README explains which functionality (separate functions) is available, links to the corresponding samples, and explains in somewhat more detail what the module is expected to do. If any extra requirements are needed to build the module without problems, add them here also.
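The optional OpenCL requirements above map to OpenCV's transparent T-API (`cv::UMat`). A quick way to check whether that path is available on a given machine is a small probe like the following; it uses only mainline OpenCV and is not part of V4D (the file name `check_opencl.cpp` is just an example).

```cpp
// check_opencl.cpp - probe for the optional OpenCL/T-API path used by the CL-GL interop features.
// Build (assuming pkg-config finds OpenCV): g++ -std=c++20 check_opencl.cpp $(pkg-config --cflags --libs opencv4)
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    if (!cv::ocl::haveOpenCL()) {
        std::cout << "No OpenCL runtime found - CL-GL interop will not be available.\n";
        return 0;
    }
    cv::ocl::setUseOpenCL(true);
    cv::ocl::Device dev = cv::ocl::Device::getDefault();
    std::cout << "OpenCL device: " << dev.name() << '\n';

    // A UMat operation goes through the OpenCL backend when it is enabled.
    cv::UMat src(1080, 1920, CV_8UC3, cv::Scalar::all(127)), dst;
    cv::GaussianBlur(src, dst, cv::Size(9, 9), 2.0);
    std::cout << "UMat blur done, OpenCL in use: " << std::boolalpha << cv::ocl::useOpenCL() << '\n';
    return 0;
}
```

Whether the `cl_khr_gl_sharing` / `cl_intel_va_api_media_sharing` extensions are present still depends on the driver; without them the framebuffer context falls back to up-/downloading instead of CL-GL sharing (see `FrameBufferContext` below).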
diff --git a/debug-env.sh b/debug-env.sh new file mode 100644 index 000000000..87cf6fce9 --- /dev/null +++ b/debug-env.sh @@ -0,0 +1,24 @@ +#OPENCV_LOC=/home/elchaschab/devel/opencv/ +#FFMPEG_LOC=/home/elchaschab/devel/cartwheel-ffmpeg/ffmpeg/ +#export LD_LIBRARY_PATH="$OPENCV_LOC/build/lib/:$FFMPEG_LOC/libavcodec/:$FFMPEG_LOC/libavutil/:$FFMPEG_LOC/libavdevice/:$FFMPEG_LOC/libavformat/:$FFMPEG_LOC/libavfilter/:$FFMPEG_LOC/libpostproc/:$FFMPEG_LOC/libswresample/:$FFMPEG_LOC/libswscale/:$LD_LIBRARY_PATH" + +export OPENCV_LOG_LEVEL=VERBOSE +export OPENCV_FFMPEG_LOGLEVEL=56 +export OPENCV_VIDEOIO_DEBUG=1 +export OPENCV_VIDEOWRITER_DEBUG=1 +export OPENCV_VIDEOCAPTURE_DEBUG=1 +export OPENCV_FFMPEG_DEBUG=1 +export OPENCV_OPENCL_RAISE_ERROR=1 +export OPENCV_OPENCL_ABORT_ON_BUILD_ERROR=1 +export OPENCV_DUMP_ERRORS=1 +export OPENCV_DUMP_CONFIG=1 +export OPENCV_TRACE=1 +export OPENCV_TRACE_DEPTH_OPENCV=1 +export OPENCV_TRACE_SYNC_OPENCL=1 +#export OPENCV_CPU_DISABLE= +#export OPENCV_OPENCL_ENABLE_MEM_USE_HOST_PTR=1 +#export OPENCV_OPENCL_ALIGNMENT_MEM_USE_HOST_PTR=1 +#export OPENCV_OPENCL_RUNTIME= +#export OPENCV_OPENCL_DEVICE= +#export OPENCV_OPENCL_SVM_DISABLE=1 + diff --git a/modules/v4d/.gitignore b/modules/v4d/.gitignore new file mode 100644 index 000000000..1f085eae8 --- /dev/null +++ b/modules/v4d/.gitignore @@ -0,0 +1,13 @@ +.project +build/ +samples/*/*.dep +samples/*/*.o +samples/beauty/beauty-demo +samples/font/font-demo +samples/nanovg/nanovg-demo +samples/optflow/optflow-demo +samples/pedestrian/pedestrian-demo +samples/shader/shader-demo +samples/tetra/tetra-demo +samples/video/video-demo + diff --git a/modules/v4d/CMakeLists.txt b/modules/v4d/CMakeLists.txt new file mode 100755 index 000000000..d8b853109 --- /dev/null +++ b/modules/v4d/CMakeLists.txt @@ -0,0 +1,229 @@ +cmake_policy(SET CMP0079 NEW) + +OCV_OPTION(OPENCV_V4D_ENABLE_ES3 "Enable OpenGL ES 3.0 backend for V4D" OFF + VERIFY HAVE_OPENGL) + +include(FetchContent) + +if(NOT EMSCRIPTEN) + find_package(glfw3 3 REQUIRED) + find_package(OpenCL REQUIRED) + find_package(GLEW REQUIRED) + include("FindOpenGL") +endif() + +set(the_description "V4D Visualization Module") +set(OPENCV_MODULE_IS_PART_OF_WORLD OFF) + +# Check CXX Features +get_property(known_features GLOBAL PROPERTY CMAKE_CXX_KNOWN_FEATURES) +list (FIND known_features "cxx_std_20" idx) +if (${idx} LESS 0) + message(STATUS "Module opencv_v4d disabled because it requires C++20") + ocv_module_disable(v4d) +endif() + +# Update submodules +find_package(Git QUIET) +if(GIT_FOUND AND EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/../../.git") +# Update submodules as needed + message(STATUS "Submodule update") + execute_process(COMMAND ${GIT_EXECUTABLE} submodule update --init --recursive + WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/../../" + RESULT_VARIABLE GIT_SUBMOD_RESULT) + if(NOT GIT_SUBMOD_RESULT EQUAL "0") + message(FATAL_ERROR "git submodule update --init --recursive failed with ${GIT_SUBMOD_RESULT}, please checkout submodules") + endif() +endif() + +if(NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui/") + message(FATAL_ERROR "The submodules were not downloaded! GIT_SUBMODULE was turned off or failed. 
Please update submodules and try again.") +endif() + +# Macro to download a file +macro(fetch_file download_name url hash) + FetchContent_Declare(${download_name} + URL ${url} + URL_HASH SHA256=${hash} + DOWNLOAD_NO_EXTRACT true + TLS_VERIFY true + ) + + FetchContent_MakeAvailable(${download_name}) +endmacro(fetch_file) + +# Macro to add a native sample +macro(add_binary_sample sample source) + if(NOT (TARGET ${sample})) + ocv_add_executable(${sample} ${source}) + endif() + ocv_target_link_libraries(${sample} OpenGL GLEW glfw X11 nanovg bgfx) + target_compile_features(${sample} PRIVATE cxx_std_20) + # set_property(TARGET ${sample} PROPERTY POSITION_INDEPENDENT_CODE ON) + target_link_directories(${sample} PRIVATE "${CMAKE_CURRENT_BINARY_DIR}/../../lib") + target_include_directories(${sample} PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}/include/" "${CMAKE_CURRENT_SOURCE_DIR}/third/glad/include" "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui" "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui/backends/" "${CMAKE_CURRENT_SOURCE_DIR}/third/nanovg/src/" "${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake/bgfx/include/" "${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake/bx/include/" "${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake/bimg/include/") +endmacro() + # set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address") + # set(CMAKE_LD_FLAGS "${CMAKE_LqD_FLAGS} -fsanitize=address -static-libasan") + + # set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=undefined") + # set(CMAKE_LD_FLAGS "${CMAKE_LD_FLAGS} -fsanitize=undefined -static-libasan") + + # set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=thread") + # set(CMAKE_LD_FLAGS "${CMAKE_LD_FLAGS} -fsanitize=thread -static-libasan") + # set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Werror -Wno-sign-promo") + + + +if (NOT (TARGET nanovg)) + #Configure NanoVG build options + if(OPENCV_V4D_ENABLE_ES3) + add_definitions(-DNANOVG_GLES3=1 ) + else() + add_definitions(-DNANOVG_GL3=1 ) + endif() + + add_subdirectory("${CMAKE_CURRENT_SOURCE_DIR}/third/nanovg/") + target_compile_options(nanovg PUBLIC -Wno-error) + target_compile_options(nanovg PUBLIC -pthread) + + # # target_include_directories(nanovg PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}/third/nanovg/src/") + # # include_directories("${CMAKE_CURRENT_SOURCE_DIR}/third/nanovg/src/") + # if(OPENCV_V4D_ENABLE_ES3) + # target_link_libraries(nanovg OpenGL::GLES3) + # else() + # target_link_libraries(nanovg OpenGL::OpenGL) + # endif() + # target_compile_features(nanovg PRIVATE cxx_std_20) + + install(TARGETS nanovg EXPORT OpenCVModules) + endif() + + if (NOT (TARGET bgfx)) + set(BGFX_BUILD_EXAMPLES OFF) + set(BGFX_LIBRARY_TYPE "SHARED") + set(BGFX_INSTALL OFF) + + if(OPENCV_V4D_ENABLE_ES3) + set(BGFX_OPENGLES_VERSION "30") + else() + set(BGFX_OPENGL_VERSION "32") + endif() + #-DBGFX_CONFIG_MULTITHREADED=0 + add_definitions(-DBGFX_CONFIG_PROFILER=0 -DBGFX_CONFIG_PASSIVE=1) + add_subdirectory("${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake") + + + target_compile_features(bgfx PRIVATE cxx_std_20) + target_compile_options(bgfx PUBLIC -Wno-error) + target_include_directories(bgfx PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}/third/glad/include") + # target_link_libraries(bgfx PUBLIC glfw) + install(TARGETS bgfx EXPORT OpenCVModules) + install(TARGETS bimg EXPORT OpenCVModules) + install(TARGETS bx EXPORT OpenCVModules) + endif() +# Add the opencv module +if(NOT (TARGET ${the_module})) + ocv_add_module(v4d opencv_core opencv_imgproc opencv_videoio opencv_video) + file(GLOB imgui_sources CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui/*.cpp" 
"${CMAKE_CURRENT_SOURCE_DIR}/src/detail/imguicontext.cpp") + file(GLOB imgui_backend_sources CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui/backends/imgui_impl_opengl3*.cpp") + file(GLOB imgui_glfw_sources CONFIGURE_DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui/backends/imgui_impl_glfw.cpp") + ocv_glob_module_sources("${CMAKE_CURRENT_SOURCE_DIR}/src" "${CMAKE_CURRENT_SOURCE_DIR}/src/detail/" ${imgui_sources} ${imgui_backend_sources} ${imgui_glfw_sources}) + ocv_module_include_directories("${CMAKE_CURRENT_SOURCE_DIR}/include/" "${CMAKE_CURRENT_SOURCE_DIR}/third/glad/include" "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui" "${CMAKE_CURRENT_SOURCE_DIR}/third/imgui/backends/" "${CMAKE_CURRENT_SOURCE_DIR}/third/nanovg/src/" "${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake/bgfx/include/" "${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake/bx/include/" "${CMAKE_CURRENT_SOURCE_DIR}/third/bgfx.cmake/bimg/include/") + ocv_create_module() + set_target_properties(${the_module} PROPERTIES LINKER_LANGUAGE CXX) + + ocv_add_samples(opencv_v4d opencv_core opencv_imgproc opencv_videoio opencv_video opencv_imgcodecs opencv_face opencv_tracking opencv_objdetect opencv_stitching opencv_optflow opencv_imgcodecs opencv_features2d opencv_dnn opencv_flann) + # Populate assets + fetch_file("LBFMODEL" "https://github.com/kurnianggoro/GSOC2017/raw/master/data/lbfmodel.yaml" "70dd8b1657c42d1595d6bd13d97d932877b3bed54a95d3c4733a0f740d1fd66b") + + fetch_file("YUNET" "https://github.com/opencv/opencv_zoo/raw/main/models/face_detection_yunet/face_detection_yunet_2023mar.onnx" "8f2383e4dd3cfbb4553ea8718107fc0423210dc964f9f4280604804ed2552fa4") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E make_directory + "${CMAKE_CURRENT_BINARY_DIR}/assets") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E make_directory + "${CMAKE_CURRENT_BINARY_DIR}/assets/doxygen") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E make_directory + "${CMAKE_CURRENT_BINARY_DIR}/assets/models") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E make_directory + "${CMAKE_CURRENT_BINARY_DIR}/assets/fonts") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${CMAKE_CURRENT_SOURCE_DIR}/samples/fonts/*.ttf" + "${CMAKE_CURRENT_BINARY_DIR}/assets/fonts/") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${CMAKE_CURRENT_LIST_DIR}/doc/lena.png" + "${CMAKE_CURRENT_BINARY_DIR}/doc/lena.png") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${lbfmodel_SOURCE_DIR}/lbfmodel.yaml" + "${CMAKE_CURRENT_BINARY_DIR}/assets/models/") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${yunet_SOURCE_DIR}/face_detection_yunet_2023mar.onnx" + "${CMAKE_CURRENT_BINARY_DIR}/assets/models/") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${CMAKE_CURRENT_SOURCE_DIR}/third/doxygen-bootstrapped/customdoxygen.css" + "${CMAKE_SOURCE_DIR}/doc/stylesheet.css") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${CMAKE_CURRENT_SOURCE_DIR}/third/doxygen-bootstrapped/example-site/header.html" + "${CMAKE_SOURCE_DIR}/doc/") + + add_custom_command(TARGET ${the_module} PRE_BUILD + COMMAND ${CMAKE_COMMAND} -E copy + "${CMAKE_CURRENT_SOURCE_DIR}/third/doxygen-bootstrapped/example-site/footer.html" + 
"${CMAKE_SOURCE_DIR}/doc/") + + list(APPEND CMAKE_DOXYGEN_HTML_FILES "${CMAKE_CURRENT_SOURCE_DIR}/third/doxygen-bootstrapped/doxy-boot.js") + + #Add sample targets + if(BUILD_EXAMPLES) + add_binary_sample(example_v4d_display_image_fb samples/display_image_fb.cpp) + add_binary_sample(example_v4d_display_image_nvg samples/display_image_nvg.cpp) + add_binary_sample(example_v4d_vector_graphics samples/vector_graphics.cpp) + add_binary_sample(example_v4d_vector_graphics_and_fb samples/vector_graphics_and_fb.cpp) + add_binary_sample(example_v4d_render_opengl samples/render_opengl.cpp) + add_binary_sample(example_v4d_custom_source_and_sink samples/custom_source_and_sink.cpp) + add_binary_sample(example_v4d_font_rendering samples/font_rendering.cpp) + add_binary_sample(example_v4d_font_with_gui samples/font_with_gui.cpp) + add_binary_sample(example_v4d_video_editing samples/video_editing.cpp) + add_binary_sample(example_v4d_cube-demo samples/cube-demo.cpp) + add_binary_sample(example_v4d_many_cubes-demo samples/many_cubes-demo.cpp) + add_binary_sample(example_v4d_video-demo samples/video-demo.cpp) + add_binary_sample(example_v4d_nanovg-demo samples/nanovg-demo.cpp) + add_binary_sample(example_v4d_font-demo samples/font-demo.cpp) + add_binary_sample(example_v4d_shader-demo samples/shader-demo.cpp) + add_binary_sample(example_v4d_pedestrian-demo samples/pedestrian-demo.cpp) + add_binary_sample(example_v4d_optflow-demo samples/optflow-demo.cpp) + add_binary_sample(example_v4d_beauty-demo samples/beauty-demo.cpp) + add_binary_sample(example_v4d_bgfx-demo samples/bgfx-demo.cpp) + add_binary_sample(example_v4d_bgfx-demo2 samples/bgfx-demo2.cpp) + add_binary_sample(example_v4d_montage-demo samples/montage-demo.cpp) + endif() + + if(OPENCV_V4D_ENABLE_ES3) + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DOPENCV_V4D_USE_ES3=1") + endif() + + + target_compile_features(${the_module} PRIVATE cxx_std_20) + ocv_warnings_disable(CMAKE_CXX_FLAGS -Wdeprecated-enum-enum-conversion) + target_link_directories(${the_module} PRIVATE "${CMAKE_CURRENT_BINARY_DIR}/../../lib") + ocv_target_link_libraries(${the_module} OpenCL OpenGL::OpenGL glfw -lnanovg -lbgfx -lbimg -lbx) +endif() diff --git a/modules/v4d/doc/custom_source_and_sink.gif b/modules/v4d/doc/custom_source_and_sink.gif new file mode 100644 index 000000000..a663752b6 Binary files /dev/null and b/modules/v4d/doc/custom_source_and_sink.gif differ diff --git a/modules/v4d/doc/display_image.png b/modules/v4d/doc/display_image.png new file mode 100644 index 000000000..ee2b2ef15 Binary files /dev/null and b/modules/v4d/doc/display_image.png differ diff --git a/modules/v4d/doc/display_image_fb.png b/modules/v4d/doc/display_image_fb.png new file mode 100644 index 000000000..dc35fcf84 Binary files /dev/null and b/modules/v4d/doc/display_image_fb.png differ diff --git a/modules/v4d/doc/font_rendering.png b/modules/v4d/doc/font_rendering.png new file mode 100644 index 000000000..8885aef4b Binary files /dev/null and b/modules/v4d/doc/font_rendering.png differ diff --git a/modules/v4d/doc/font_with_gui.png b/modules/v4d/doc/font_with_gui.png new file mode 100644 index 000000000..ea5b93197 Binary files /dev/null and b/modules/v4d/doc/font_with_gui.png differ diff --git a/modules/v4d/doc/lena.png b/modules/v4d/doc/lena.png new file mode 100644 index 000000000..59ef68aab Binary files /dev/null and b/modules/v4d/doc/lena.png differ diff --git a/modules/v4d/doc/render_opengl.png b/modules/v4d/doc/render_opengl.png new file mode 100644 index 000000000..0b548f334 Binary files /dev/null 
and b/modules/v4d/doc/render_opengl.png differ diff --git a/modules/v4d/doc/v4d.bib b/modules/v4d/doc/v4d.bib new file mode 100644 index 000000000..e69de29bb diff --git a/modules/v4d/doc/vector_graphics.png b/modules/v4d/doc/vector_graphics.png new file mode 100644 index 000000000..cd46af99c Binary files /dev/null and b/modules/v4d/doc/vector_graphics.png differ diff --git a/modules/v4d/doc/vector_graphics_and_fb.png b/modules/v4d/doc/vector_graphics_and_fb.png new file mode 100644 index 000000000..3a9a8ad89 Binary files /dev/null and b/modules/v4d/doc/vector_graphics_and_fb.png differ diff --git a/modules/v4d/doc/video_editing.png b/modules/v4d/doc/video_editing.png new file mode 100644 index 000000000..e8fa11690 Binary files /dev/null and b/modules/v4d/doc/video_editing.png differ diff --git a/modules/v4d/include/opencv2/v4d/detail/cl.hpp b/modules/v4d/include/opencv2/v4d/detail/cl.hpp new file mode 100644 index 000000000..3003f793b --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/cl.hpp @@ -0,0 +1,13 @@ +#ifndef MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_CL_HPP_ +#define MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_CL_HPP_ + +#ifndef CL_TARGET_OPENCL_VERSION +# define CL_TARGET_OPENCL_VERSION 120 +#endif +#ifdef __APPLE__ +# include +#else +# include +#endif + +#endif /* MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_CL_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/context.hpp b/modules/v4d/include/opencv2/v4d/detail/context.hpp new file mode 100644 index 000000000..dc6ce92c3 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/context.hpp @@ -0,0 +1,44 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include +#include "../../../../include/opencv2/v4d/util.hpp" + +#ifndef MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_V4DCONTEXT_HPP_ +#define MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_V4DCONTEXT_HPP_ + +namespace cv { +namespace v4d { +namespace detail { + +class V4DContext { +public: + virtual ~V4DContext() {} + virtual void execute(std::function fn) = 0; +}; + +class OnceContext : public V4DContext { + inline static std::once_flag flag_; +public: + virtual ~OnceContext() {} + virtual void execute(std::function fn) override { + std::call_once(flag_, fn); + } +}; + + +class PlainContext : public V4DContext { +public: + virtual ~PlainContext() {} + virtual void execute(std::function fn) override { + fn(); + } +}; + +} +} +} + +#endif /* MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_V4DCONTEXT_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/framebuffercontext.hpp b/modules/v4d/include/opencv2/v4d/detail/framebuffercontext.hpp new file mode 100644 index 000000000..9761b2350 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/framebuffercontext.hpp @@ -0,0 +1,352 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_FRAMEBUFFERCONTEXT_HPP_ +#define SRC_OPENCV_FRAMEBUFFERCONTEXT_HPP_ + +#include "cl.hpp" +#include "context.hpp" +#include +#include +#include "opencv2/v4d/util.hpp" +#include +#include +#include +#define GLFW_INCLUDE_NONE +#include +typedef unsigned int GLenum; +#define GL_FRAMEBUFFER 0x8D40 + +namespace cv { +namespace v4d { +class V4D; + +namespace detail { +#ifdef HAVE_OPENCL +typedef cv::ocl::OpenCLExecutionContext CLExecContext_t; +class CLExecScope_t +{ + CLExecContext_t ctx_; +public: + inline CLExecScope_t(const CLExecContext_t& ctx) + { + if(ctx.empty()) + return; + ctx_ = CLExecContext_t::getCurrentRef(); + ctx.bind(); + } + + inline ~CLExecScope_t() + { + if (!ctx_.empty()) + { + ctx_.bind(); + } + } +}; +#else +struct CLExecContext_t { + bool empty() { + return true; + } + static CLExecContext_t getCurrent() { + return CLExecContext_t(); + } +}; +class CLExecScope_t +{ + CLExecContext_t ctx_; +public: + inline CLExecScope_t(const CLExecContext_t& ctx) + { + } + + inline ~CLExecScope_t() + { + } +}; +#endif +/*! + * The FrameBufferContext acquires the framebuffer from OpenGL (either by up-/download or by cl-gl sharing) + */ +class CV_EXPORTS FrameBufferContext : public V4DContext { + typedef unsigned int GLuint; + typedef signed int GLint; + + friend class SourceContext; + friend class SinkContext; + friend class GLContext; + friend class NanoVGContext; + friend class ImGuiContextImpl; + friend class cv::v4d::V4D; + cv::Ptr self_ = this; + V4D* v4d_ = nullptr; + bool offscreen_; + string title_; + int major_; + int minor_; + int samples_; + bool debug_; + GLFWwindow* glfwWindow_ = nullptr; + bool clglSharing_ = true; + bool isVisible_; + GLuint onscreenTextureID_ = 0; + GLuint onscreenRenderBufferID_ = 0; + GLuint frameBufferID_ = 0; + GLuint textureID_ = 0; + GLuint renderBufferID_ = 0; + GLint viewport_[4]; + cl_mem clImage_ = nullptr; + CLExecContext_t context_; + const cv::Size framebufferSize_; + bool hasParent_ = false; + GLFWwindow* rootWindow_; + cv::Ptr parent_; + bool isRoot_ = true; + + //data and handles for webgl copying + std::map texture_hdls_; + std::map resolution_hdls_; + + std::map shader_program_hdls_; + + //gl object maps + std::map copyVaos, copyVbos, copyEbos; + + // vertex position, color + const float copyVertices[12] = { + // x y z + -1.0f, -1.0f, -0.0f, + 1.0f, 1.0f, -0.0f, + -1.0f, 1.0f, -0.0f, + 1.0f, -1.0f, -0.0f }; + + const unsigned int copyIndices[6] = { + // 2---,1 + // | .' | + // 0'---3 + 0, 1, 2, 0, 3, 1 }; + + std::map copyFramebuffers_; + std::map copyTextures_; + int index_; + + void* currentSyncObject_ = 0; + static bool firstSync_; +public: + /*! + * Acquires and releases the framebuffer from and to OpenGL. + */ + class CV_EXPORTS FrameBufferScope { + cv::Ptr ctx_; + cv::UMat& m_; +#ifdef HAVE_OPENCL + std::shared_ptr pExecCtx; +#endif + public: + /*! + * Aquires the framebuffer via cl-gl sharing. + * @param ctx The corresponding #FrameBufferContext. + * @param m The UMat to bind the OpenGL framebuffer to. + */ + CV_EXPORTS FrameBufferScope(cv::Ptr ctx, cv::UMat& m) : + ctx_(ctx), m_(m) +#ifdef HAVE_OPENCL + , pExecCtx(std::static_pointer_cast(m.u->allocatorContext)) +#endif + { + CV_Assert(!m.empty()); +#ifdef HAVE_OPENCL + if(pExecCtx) { + CLExecScope_t execScope(*pExecCtx.get()); + ctx_->acquireFromGL(m_); + } else { +#endif + ctx_->acquireFromGL(m_); +#ifdef HAVE_OPENCL + } +#endif + } + /*! + * Releases the framebuffer via cl-gl sharing. 
+ */ + CV_EXPORTS virtual ~FrameBufferScope() { +#ifdef HAVE_OPENCL + if (pExecCtx) { + CLExecScope_t execScope(*pExecCtx.get()); + ctx_->releaseToGL(m_); + } + else { +#endif + ctx_->releaseToGL(m_); +#ifdef HAVE_OPENCL + } +#endif + } + }; + + /*! + * Setups and tears-down OpenGL states. + */ + class CV_EXPORTS GLScope { + cv::Ptr ctx_; + public: + /*! + * Setup OpenGL states. + * @param ctx The corresponding #FrameBufferContext. + */ + CV_EXPORTS GLScope(cv::Ptr ctx, GLenum framebufferTarget = GL_FRAMEBUFFER) : + ctx_(ctx) { + ctx_->begin(framebufferTarget); + } + /*! + * Tear-down OpenGL states. + */ + CV_EXPORTS ~GLScope() { + ctx_->end(); + } + }; + + /*! + * Create a FrameBufferContext with given size. + * @param frameBufferSize The frame buffer size. + */ + FrameBufferContext(V4D& v4d, const cv::Size& frameBufferSize, bool offscreen, + const string& title, int major, int minor, int samples, bool debug, GLFWwindow* rootWindow, cv::Ptr parent, bool root); + + FrameBufferContext(V4D& v4d, const string& title, cv::Ptr other); + + /*! + * Default destructor. + */ + virtual ~FrameBufferContext(); + + cv::Ptr self() { + return self_; + } + + GLuint getFramebufferID(); + GLuint getTextureID(); + /*! + * Get the framebuffer size. + * @return The framebuffer size. + */ + const cv::Size& size() const; + void copyTo(cv::UMat& dst); + void copyFrom(const cv::UMat& src); + void copyToRootWindow(); + + /*! + * Execute function object fn inside a framebuffer context. + * The context acquires the framebuffer from OpenGL (either by up-/download or by cl-gl sharing) + * and provides it to the functon object. This is a good place to use OpenCL + * directly on the framebuffer. + * @param fn A function object that is passed the framebuffer to be read/manipulated. + */ + virtual void execute(std::function fn) override { + if(!getCLExecContext().empty()) { + CLExecScope_t clExecScope(getCLExecContext()); + FrameBufferContext::GLScope glScope(self(), GL_FRAMEBUFFER); + FrameBufferContext::FrameBufferScope fbScope(self(), framebuffer_); + fn(); + } else { + FrameBufferContext::GLScope glScope(self(), GL_FRAMEBUFFER); + FrameBufferContext::FrameBufferScope fbScope(self(), framebuffer_); + fn(); + } + } + cv::Vec2f position(); + float pixelRatioX(); + float pixelRatioY(); + void makeCurrent(); + void makeNoneCurrent(); + bool isResizable(); + void setResizable(bool r); + void setWindowSize(const cv::Size& sz); + cv::Size getWindowSize(); + bool isFullscreen(); + void setFullscreen(bool f); + cv::Size getNativeFrameBufferSize(); + void setVisible(bool v); + bool isVisible(); + void close(); + bool isClosed(); + bool isRoot(); + bool hasParent(); + bool hasRootWindow(); + + /*! + * Blit the framebuffer to the screen + * @param viewport ROI to blit + * @param windowSize The size of the window to blit to + * @param stretch if true stretch the framebuffer to window size + */ + void blitFrameBufferToFrameBuffer(const cv::Rect& srcViewport, const cv::Size& targetFbSize, + GLuint targetFramebufferID = 0, bool stretch = true, bool flipY = false); +protected: + void fence(); + bool wait(const uint64_t& timeout = 0); + CLExecContext_t& getCLExecContext(); + cv::Ptr getV4D(); + int getIndex(); + void setup(); + void teardown(); + /*! + * The UMat used to copy or bind (depending on cl-gl interop capability) the OpenGL framebuffer. + */ + /*! + * The internal framebuffer exposed as OpenGL Texture2D. + * @return The texture object. 
+ */ + cv::ogl::Texture2D& getTexture2D(); + + GLFWwindow* getGLFWWindow() const; +private: + void loadBuffers(const size_t& index); + void loadShader(const size_t& index); + void init(); + CV_EXPORTS cv::UMat& fb(); + /*! + * Setup OpenGL states. + */ + CV_EXPORTS void begin(GLenum framebufferTarget); + /*! + * Tear-down OpenGL states. + */ + CV_EXPORTS void end(); + /*! + * Download the framebuffer to UMat m. + * @param m The target UMat. + */ + void download(cv::UMat& m); + /*! + * Uploat UMat m to the framebuffer. + * @param m The UMat to upload. + */ + void upload(const cv::UMat& m); + /*! + * Acquire the framebuffer using cl-gl sharing. + * @param m The UMat the framebuffer will be bound to. + */ + void acquireFromGL(cv::UMat& m); + /*! + * Release the framebuffer using cl-gl sharing. + * @param m The UMat the framebuffer is bound to. + */ + void releaseToGL(cv::UMat& m); + void toGLTexture2D(cv::UMat& u, cv::ogl::Texture2D& texture); + void fromGLTexture2D(const cv::ogl::Texture2D& texture, cv::UMat& u); + + cv::UMat framebuffer_; + /*! + * The texture bound to the OpenGL framebuffer. + */ + cv::ogl::Texture2D* texture_ = nullptr; +}; +} +} +} + +#endif /* SRC_OPENCV_FRAMEBUFFERCONTEXT_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/gl.hpp b/modules/v4d/include/opencv2/v4d/detail/gl.hpp new file mode 100644 index 000000000..5e927c858 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/gl.hpp @@ -0,0 +1,19 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + + +#ifndef MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_GL_HPP_ +#define MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_GL_HPP_ + +# if !defined(OPENCV_V4D_USE_ES3) +# include "GL/glew.h" +# define GLFW_INCLUDE_NONE +# else +# define GLFW_INCLUDE_ES3 +# define GLFW_INCLUDE_GLEXT +# endif +# include + +#endif /* MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_GL_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/glcontext.hpp b/modules/v4d/include/opencv2/v4d/detail/glcontext.hpp new file mode 100644 index 000000000..bcff37387 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/glcontext.hpp @@ -0,0 +1,43 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_GLCONTEXT_HPP_ +#define SRC_OPENCV_GLCONTEXT_HPP_ + +#include "opencv2/v4d/detail/framebuffercontext.hpp" +#include "opencv2/v4d/detail/gl.hpp" +struct NVGcontext; +namespace cv { +namespace v4d { +namespace detail { +/*! + * Used to setup an OpengLG context + */ +class CV_EXPORTS GLContext : public V4DContext { + const int32_t idx_; + cv::Ptr mainFbContext_; + cv::Ptr glFbContext_; +public: + /*! + * Creates a OpenGL Context + * @param fbContext The framebuffer context + */ + GLContext(const int32_t& idx, cv::Ptr fbContext); + virtual ~GLContext() {}; + /*! + * Execute function object fn inside a gl context. + * The context takes care of setting up opengl states. 
+ * @param fn A function that is passed the size of the framebuffer + * and performs drawing using opengl + */ + virtual void execute(std::function fn) override; + const int32_t& getIndex() const; + cv::Ptr fbCtx(); +}; +} +} +} + +#endif /* SRC_OPENCV_GLCONTEXT_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/imguicontext.hpp b/modules/v4d/include/opencv2/v4d/detail/imguicontext.hpp new file mode 100644 index 000000000..2e2826769 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/imguicontext.hpp @@ -0,0 +1,35 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_IMGUIContext_HPP_ +#define SRC_OPENCV_IMGUIContext_HPP_ + +#include "opencv2/v4d/detail/framebuffercontext.hpp" +#include "imgui.h" + +#include "opencv2/v4d/detail/imguicontext.hpp" + + +namespace cv { +namespace v4d { +namespace detail { +class CV_EXPORTS ImGuiContextImpl { + friend class cv::v4d::V4D; + cv::Ptr mainFbContext_; + ImGuiContext* context_; + std::function renderCallback_; + bool firstFrame_ = true; +public: + CV_EXPORTS ImGuiContextImpl(cv::Ptr fbContext); + CV_EXPORTS void build(std::function fn); +protected: + CV_EXPORTS void makeCurrent(); + CV_EXPORTS void render(bool displayFPS); +}; +} +} +} + +#endif /* SRC_OPENCV_IMGUIContext_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/nanovgcontext.hpp b/modules/v4d/include/opencv2/v4d/detail/nanovgcontext.hpp new file mode 100644 index 000000000..334dbca0d --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/nanovgcontext.hpp @@ -0,0 +1,81 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_NANOVGCONTEXT_HPP_ +#define SRC_OPENCV_NANOVGCONTEXT_HPP_ + +#include "framebuffercontext.hpp" + +struct NVGcontext; +namespace cv { +namespace v4d { +namespace detail { +/*! + * Used to setup a nanovg context + */ +class CV_EXPORTS NanoVGContext : public V4DContext { + cv::Ptr mainFbContext_; + cv::Ptr nvgFbContext_; + NVGcontext* context_; + cv::Size_ scale_ = {1.0f, 1.0f}; +public: + /*! + * Makes sure #NanoVGContext::begin and #NanoVGContext::end are both called + */ + class Scope { + NanoVGContext& ctx_; + public: + /*! + * Setup NanoVG rendering + * @param ctx The corresponding #NanoVGContext + */ + Scope(NanoVGContext& ctx) : + ctx_(ctx) { + ctx_.begin(); + } + /*! + * Tear-down NanoVG rendering + */ + ~Scope() { + ctx_.end(); + } + }; + + /*! + * Creates a NanoVGContext + * @param v4d The V4D object used in conjunction with this context + * @param context The native NVGContext + * @param fbContext The framebuffer context + */ + NanoVGContext(cv::Ptr fbContext); + virtual ~NanoVGContext() {}; + + /*! + * Execute function object fn inside a nanovg context. + * The context takes care of setting up opengl and nanovg states. + * A function object passed like that can use the functions in cv::viz::nvg. + * @param fn A function that is passed the size of the framebuffer + * and performs drawing using cv::viz::nvg + */ + virtual void execute(std::function fn) override; + + void setScale(const cv::Size_& scale); + cv::Ptr fbCtx(); +private: + /*! + * Setup NanoVG context + */ + void begin(); + /*! 
+ * Tear down NanoVG context + */ + void end(); + +}; +} +} +} + +#endif /* SRC_OPENCV_NANOVGCONTEXT_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/resequence.hpp b/modules/v4d/include/opencv2/v4d/detail/resequence.hpp new file mode 100644 index 000000000..110d1c05f --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/resequence.hpp @@ -0,0 +1,38 @@ +#ifndef MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_RESEQUENCE_HPP_ +#define MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_RESEQUENCE_HPP_ + +#include +#include +#include +#include +#include +#include +#include + +namespace cv { +namespace v4d { + + + +class Resequence { + bool finish_ = false; + std::mutex putMtx_; + std::mutex waitMtx_; + std::condition_variable cv_; + uint64_t nextSeq_ = 0; +public: + Resequence() { + } + + virtual ~Resequence() {} + void finish(); + void notify(); + void waitFor(const uint64_t& seq); +}; + +} /* namespace v4d */ +} /* namespace kb */ + + + +#endif /* MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_RESEQUENCE_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/sinkcontext.hpp b/modules/v4d/include/opencv2/v4d/detail/sinkcontext.hpp new file mode 100644 index 000000000..f460d3a30 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/sinkcontext.hpp @@ -0,0 +1,56 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_SINKCONTEXT_HPP_ +#define SRC_OPENCV_SINKCONTEXT_HPP_ + +#include "framebuffercontext.hpp" + +namespace cv { +namespace v4d { +class V4D; +namespace detail { + +/*! + * Provides a context for writing to a Sink + */ +class CV_EXPORTS SinkContext : public V4DContext { + friend class cv::v4d::V4D; + CLExecContext_t context_; + cv::UMat sinkBuffer_; + bool hasContext_ = false; + cv::Ptr mainFbContext_; +public: + /*! + * Create the CLVAContext + * @param fbContext The corresponding framebuffer context + */ + SinkContext(cv::Ptr fbContext); + virtual ~SinkContext() {}; + /*! + * Called to capture from a function object. + * The functor fn is passed a UMat which it writes to which in turn is captured to the framebuffer. + * @param fn The functor that provides the data. + * @return true if successful- + */ + virtual void execute(std::function fn) override; + /*! + * Called to pass the frambuffer to a functor which consumes it (e.g. writes to a video file). + * @param fn The functor that consumes the data, + */ + + /*FIXME only public till https://github.com/opencv/opencv/pull/22780 is resolved. + * required for manual initialization of VideoCapture/VideoWriter + */ + bool hasContext(); + void copyContext(); + CLExecContext_t getCLExecContext(); + cv::UMat& sinkBuffer(); +}; +} +} +} + +#endif /* SRC_OPENCV_SINKCONTEXT_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/sourcecontext.hpp b/modules/v4d/include/opencv2/v4d/detail/sourcecontext.hpp new file mode 100644 index 000000000..48ac611e4 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/sourcecontext.hpp @@ -0,0 +1,56 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_CLVACONTEXT_HPP_ +#define SRC_OPENCV_CLVACONTEXT_HPP_ + +#include "framebuffercontext.hpp" + +namespace cv { +namespace v4d { +class V4D; +namespace detail { + +/*! + * Provides a context for OpenCL-VAAPI sharing + */ +class CV_EXPORTS SourceContext : public V4DContext { + friend class cv::v4d::V4D; + CLExecContext_t context_; + cv::UMat captureBuffer_; + cv::UMat captureBufferRGB_; + bool hasContext_ = false; + cv::Ptr mainFbContext_; + uint64_t currentSeqNr_ = 0; +public: + /*! + * Create the CLVAContext + * @param fbContext The corresponding framebuffer context + */ + SourceContext(cv::Ptr fbContext); + virtual ~SourceContext() {}; + /*! + * Called to capture from a function object. + * The functor fn is passed a UMat which it writes to which in turn is captured to the framebuffer. + * @param fn The functor that provides the data. + * @return true if successful- + */ + virtual void execute(std::function fn) override; + + uint64_t sequenceNumber(); + + /*FIXME only public till https://github.com/opencv/opencv/pull/22780 is resolved. + * required for manual initialization of VideoCapture/VideoWriter + */ + bool hasContext(); + void copyContext(); + CLExecContext_t getCLExecContext(); + cv::UMat& sourceBuffer(); +}; +} +} +} + +#endif /* SRC_OPENCV_CLVACONTEXT_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/timetracker.hpp b/modules/v4d/include/opencv2/v4d/detail/timetracker.hpp new file mode 100644 index 000000000..412d1a312 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/timetracker.hpp @@ -0,0 +1,139 @@ +#ifndef TIME_TRACKER_HPP_ +#define TIME_TRACKER_HPP_ + +#include +#include +#include +#include +#include +#include +#include +#include + +using std::ostream; +using std::stringstream; +using std::string; +using std::map; +using std::chrono::microseconds; +using std::mutex; + +struct CV_EXPORTS TimeInfo { + long totalCnt_ = 0; + long totalTime_ = 0; + long iterCnt_ = 0; + long iterTime_ = 0; + long last_ = 0; + + void add(size_t t) { + last_ = t; + totalTime_ += t; + iterTime_ += t; + ++totalCnt_; + ++iterCnt_; + + if (totalCnt_ == std::numeric_limits::max() || totalTime_ == std::numeric_limits::max()) { + totalCnt_ = 0; + totalTime_ = 0; + } + + if (iterCnt_ == std::numeric_limits::max() || iterTime_ == std::numeric_limits::max()) { + iterCnt_ = 0; + iterTime_ = 0; + } + } + + void newCount() { + iterCnt_ = 0; + iterTime_ = 0; + } + + string str() const { + stringstream ss; + ss << (totalTime_ / 1000.0) / totalCnt_ << "ms = (" << totalTime_ / 1000.0 << '\\' << totalCnt_ << ")\t"; + ss << (iterTime_ / 1000.0) / iterCnt_ << "ms = (" << iterTime_ / 1000.0 << '\\' << iterCnt_ << ")\t"; + return ss.str(); + } +}; + +inline std::ostream& operator<<(ostream &os, TimeInfo &ti) { + os << (ti.totalTime_ / 1000.0) / ti.totalCnt_ << "ms = (" << ti.totalTime_ / 1000.0 << '\\' << ti.totalCnt_ << ")\t"; + os << (ti.iterTime_ / 1000.0) / ti.iterCnt_ << "ms = (" << ti.iterTime_ / 1000.0 << '\\' << ti.iterCnt_ << ")"; + return os; +} + +class CV_EXPORTS TimeTracker { +private: + static TimeTracker *instance_; + mutex mapMtx_; + map tiMap_; + bool enabled_; + TimeTracker(); +public: + virtual ~TimeTracker(); + + map& getMap() { + return tiMap_; + } + + template void execute(const string &name, F const &func) { + auto start = std::chrono::system_clock::now(); + func(); + auto duration = std::chrono::duration_cast(std::chrono::system_clock::now() - start); + std::unique_lock lock(mapMtx_); + 
tiMap_[name].add(duration.count()); + } + + template size_t measure(F const &func) { + auto start = std::chrono::system_clock::now(); + func(); + auto duration = std::chrono::duration_cast(std::chrono::system_clock::now() - start); + return duration.count(); + } + + bool isEnabled() { + return enabled_; + } + + void setEnabled(bool e) { + enabled_ = e; + } + + void print(ostream &os) { + std::unique_lock lock(mapMtx_); + stringstream ss; + ss << "Time tracking info: " << std::endl; + for (auto it : tiMap_) { + ss << "\t" << it.first << ": " << it.second << std::endl; + } + + os << ss.str(); + } + + void reset() { + std::unique_lock lock(mapMtx_); + tiMap_.clear(); + } + + static TimeTracker* getInstance() { + if (instance_ == NULL) + instance_ = new TimeTracker(); + + return instance_; + } + + static void destroy() { + if (instance_) + delete instance_; + + instance_ = NULL; + } + + void newCount() { + std::unique_lock lock(mapMtx_); + for (auto& pair : getMap()) { + pair.second.newCount(); + } + } +}; + +#endif /* TIME_TRACKER_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/detail/transaction.hpp b/modules/v4d/include/opencv2/v4d/detail/transaction.hpp new file mode 100644 index 000000000..c6c46752f --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/detail/transaction.hpp @@ -0,0 +1,112 @@ +#ifndef MODULES_V4D_SRC_BACKEND_HPP_ +#define MODULES_V4D_SRC_BACKEND_HPP_ + +#include "context.hpp" + +#include +#include +#include +#include +#include + +namespace cv { +namespace v4d { + +class Transaction { +private: + cv::Ptr ctx_; +public: + virtual ~Transaction() {} + virtual void perform() = 0; + virtual bool enabled() = 0; + virtual bool isPredicate() = 0; + virtual bool lock() = 0; + + void setContext(cv::Ptr ctx) { + ctx_ = ctx; + } + + cv::Ptr getContext() { + return ctx_; + } +}; + +namespace detail { + +template +class TransactionImpl : public Transaction +{ + static_assert(sizeof...(Ts) == 0 || (!(std::is_rvalue_reference_v && ...))); +private: + bool lock_; + F f; + std::tuple args; +public: + template && ...))>> + TransactionImpl(bool lock, FwdF&& func, FwdTs&&... fwdArgs) + : lock_(lock), + f(std::forward(func)), + args{std::forward_as_tuple(fwdArgs...)} + {} + + virtual ~TransactionImpl() override + {} + + virtual void perform() override + { + std::apply(f, args); + } + + template + typename std::enable_if::type enabled() { + return std::apply(f, args); + } + + template + typename std::enable_if::type enabled() { + return false; + } + + virtual bool enabled() override { + return enabled, bool>>(); + } + + template + typename std::enable_if::type isPredicate() { + return true; + } + + template + typename std::enable_if::type isPredicate() { + return false; + } + + virtual bool isPredicate() override { + return isPredicate, bool>>(); + } + + virtual bool lock() override { + return lock_; + } +}; +} + +template +cv::Ptr make_transaction(bool lock, F f, Args&&... args) { + return cv::Ptr(dynamic_cast(new detail::TransactionImpl, std::remove_cv_t...> + (lock, std::forward(f), std::forward(args)...))); +} + + +template +cv::Ptr make_transaction(bool lock, F f, Tfb&& fb, Args&&... 
args) { + return cv::Ptr(dynamic_cast(new detail::TransactionImpl, std::remove_cv_t, std::remove_cv_t...> + (lock, std::forward(f), std::forward(fb), std::forward(args)...))); +} + + +} +} + +#endif /* MODULES_V4D_SRC_BACKEND_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/events.hpp b/modules/v4d/include/opencv2/v4d/events.hpp new file mode 100644 index 000000000..1fba2e69a --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/events.hpp @@ -0,0 +1,470 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_EVENTS_HPP_ +#define MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_EVENTS_HPP_ + +#include +#include + +namespace cv { +namespace v4d { +namespace event { + +inline static thread_local GLFWwindow* current_window = nullptr; + +struct WindowState { + cv::Size size; + cv::Point position; + bool focused; +}; + +inline static thread_local WindowState window_state; + +static GLFWwindow* get_current_glfw_window() { + if(current_window == nullptr) + CV_Error(cv::Error::StsBadArg, "No current glfw window set for event handling. You probably tried to call one of the cv::v4d::event functions outside a context-call."); + return current_window; +} + +static void set_current_glfw_window(GLFWwindow* window) { + current_window = window; +} + +// Define an enum class for the V4D keys +enum class Key { + KEY_A, + KEY_B, + KEY_C, + KEY_D, + KEY_E, + KEY_F, + KEY_G, + KEY_H, + KEY_I, + KEY_J, + KEY_K, + KEY_L, + KEY_M, + KEY_N, + KEY_O, + KEY_P, + KEY_Q, + KEY_R, + KEY_S, + KEY_T, + KEY_U, + KEY_V, + KEY_W, + KEY_X, + KEY_Y, + KEY_Z, + KEY_0, + KEY_1, + KEY_2, + KEY_3, + KEY_4, + KEY_5, + KEY_6, + KEY_7, + KEY_8, + KEY_9, + KEY_SPACE, + KEY_ENTER, + KEY_BACKSPACE, + KEY_TAB, + KEY_ESCAPE, + KEY_UP, + KEY_DOWN, + KEY_LEFT, + KEY_RIGHT, + KEY_HOME, + KEY_END, + KEY_PAGE_UP, + KEY_PAGE_DOWN, + KEY_INSERT, + KEY_DELETE, + KEY_F1, + KEY_F2, + KEY_F3, + KEY_F4, + KEY_F5, + KEY_F6, + KEY_F7, + KEY_F8, + KEY_F9, + KEY_F10, + KEY_F11, + KEY_F12 +}; + +enum class KeyEventType { + NONE, + PRESS, + RELEASE, + REPEAT, + HOLD +}; + +inline static thread_local std::map key_states; + +constexpr Key get_v4d_key(int glfw_key) { + switch (glfw_key) { + case GLFW_KEY_A: return Key::KEY_A; + case GLFW_KEY_B: return Key::KEY_B; + case GLFW_KEY_C: return Key::KEY_C; + case GLFW_KEY_D: return Key::KEY_D; + case GLFW_KEY_E: return Key::KEY_E; + case GLFW_KEY_F: return Key::KEY_F; + case GLFW_KEY_G: return Key::KEY_G; + case GLFW_KEY_H: return Key::KEY_H; + case GLFW_KEY_I: return Key::KEY_I; + case GLFW_KEY_J: return Key::KEY_J; + case GLFW_KEY_K: return Key::KEY_K; + case GLFW_KEY_L: return Key::KEY_L; + case GLFW_KEY_M: return Key::KEY_M; + case GLFW_KEY_N: return Key::KEY_N; + case GLFW_KEY_O: return Key::KEY_O; + case GLFW_KEY_P: return Key::KEY_P; + case GLFW_KEY_Q: return Key::KEY_Q; + case GLFW_KEY_R: return Key::KEY_R; + case GLFW_KEY_S: return Key::KEY_S; + case GLFW_KEY_T: return Key::KEY_T; + case GLFW_KEY_U: return Key::KEY_U; + case GLFW_KEY_V: return Key::KEY_V; + case GLFW_KEY_W: return Key::KEY_W; + case GLFW_KEY_X: return Key::KEY_X; + case GLFW_KEY_Y: return Key::KEY_Y; + case GLFW_KEY_Z: return Key::KEY_Z; + case GLFW_KEY_0: return Key::KEY_0; + case GLFW_KEY_1: return Key::KEY_1; + case GLFW_KEY_2: return Key::KEY_2; + case GLFW_KEY_3: return Key::KEY_3; + case GLFW_KEY_4: return 
Key::KEY_4; + case GLFW_KEY_5: return Key::KEY_5; + case GLFW_KEY_6: return Key::KEY_6; + case GLFW_KEY_7: return Key::KEY_7; + case GLFW_KEY_8: return Key::KEY_8; + case GLFW_KEY_9: return Key::KEY_9; + case GLFW_KEY_SPACE: return Key::KEY_SPACE; + case GLFW_KEY_ENTER: return Key::KEY_ENTER; + case GLFW_KEY_BACKSPACE: return Key::KEY_BACKSPACE; + case GLFW_KEY_TAB: return Key::KEY_TAB; + case GLFW_KEY_ESCAPE: return Key::KEY_ESCAPE; + case GLFW_KEY_UP: return Key::KEY_UP; + case GLFW_KEY_DOWN: return Key::KEY_DOWN; + case GLFW_KEY_LEFT: return Key::KEY_LEFT; + case GLFW_KEY_RIGHT: return Key::KEY_RIGHT; + case GLFW_KEY_END: return Key::KEY_END; + case GLFW_KEY_PAGE_UP: return Key::KEY_PAGE_UP; + case GLFW_KEY_PAGE_DOWN: return Key::KEY_PAGE_DOWN; + case GLFW_KEY_INSERT: return Key::KEY_INSERT; + case GLFW_KEY_DELETE: return Key::KEY_DELETE; + case GLFW_KEY_F1: return Key::KEY_F1; + case GLFW_KEY_F2: return Key::KEY_F2; + case GLFW_KEY_F3: return Key::KEY_F3; + case GLFW_KEY_F4: return Key::KEY_F4; + case GLFW_KEY_F5: return Key::KEY_F5; + case GLFW_KEY_F6: return Key::KEY_F6; + case GLFW_KEY_F7: return Key::KEY_F7; + case GLFW_KEY_F8: return Key::KEY_F8; + case GLFW_KEY_F9: return Key::KEY_F9; + case GLFW_KEY_F10: return Key::KEY_F10; + case GLFW_KEY_F11: return Key::KEY_F11; + case GLFW_KEY_F12: return Key::KEY_F12; + default: + CV_Error_(cv::Error::StsBadArg, ("Invalid key: %d. Please ensure the key is within the valid range.", glfw_key)); + return Key::KEY_F12; + } +} + +static KeyEventType get_key_event_type(int key) { + Key v4d_key = get_v4d_key(key); + int state = glfwGetKey(get_current_glfw_window(), key); + switch (state) { + case GLFW_PRESS: + key_states[v4d_key] = true; + return KeyEventType::PRESS; + case GLFW_RELEASE: + key_states[v4d_key] = false; + return KeyEventType::RELEASE; + case GLFW_REPEAT: + return KeyEventType::REPEAT; + default: + return KeyEventType::NONE; + } +} + +static KeyEventType get_key_hold_event(Key key) { + if (key_states[key]) { + return KeyEventType::HOLD; + } else { + return KeyEventType::NONE; + } +} + +// Define an enum class for the V4D mouse buttons +enum class MouseButton { + LEFT, + RIGHT, + MIDDLE, + BUTTON_4, + BUTTON_5, + BUTTON_6, + BUTTON_7, + BUTTON_8 +}; + +enum class MouseEventType { + NONE, + PRESS, + RELEASE, + MOVE, + SCROLL, + DRAG_START, + DRAG, + DRAG_END, + HOVER_ENTER, + HOVER_EXIT, + DOUBLE_CLICK +}; + +// Define a static function that returns the mouse position as a cv::Point2d +static cv::Point2d get_mouse_position() { + // Declare variables to store the mouse position + double x, y; + // Get the mouse position using glfwGetCursorPos + glfwGetCursorPos(get_current_glfw_window(), &x, &y); + // Return the mouse position as a cv::Point2d + return cv::Point2d(x, y); +} + +inline static thread_local std::map button_states; +inline static thread_local cv::Point2d last_position = get_mouse_position(); +inline static thread_local cv::Point2d scroll_offset(0, 0); + +static void scroll_callback(GLFWwindow* window, double xoffset, double yoffset) +{ + // Update the scroll offset + scroll_offset = cv::Point2d(xoffset, yoffset); +} + +constexpr static MouseButton get_v4d_mouse_button(int glfw_button) { + switch (glfw_button) { + case GLFW_MOUSE_BUTTON_LEFT: return MouseButton::LEFT; + case GLFW_MOUSE_BUTTON_RIGHT: return MouseButton::RIGHT; + case GLFW_MOUSE_BUTTON_MIDDLE: return MouseButton::MIDDLE; + case GLFW_MOUSE_BUTTON_4: return MouseButton::BUTTON_4; + case GLFW_MOUSE_BUTTON_5: return MouseButton::BUTTON_5; + case 
GLFW_MOUSE_BUTTON_6: return MouseButton::BUTTON_6; + case GLFW_MOUSE_BUTTON_7: return MouseButton::BUTTON_7; + case GLFW_MOUSE_BUTTON_8: return MouseButton::BUTTON_8; + default: CV_Error_(cv::Error::StsBadArg, ("Invalid mouse button: %d. Please ensure the button is within the valid range.", glfw_button)); + } +} + +static MouseEventType get_mouse_event_type(int button) { + MouseButton v4d_button = get_v4d_mouse_button(button); + int state = glfwGetMouseButton(get_current_glfw_window(), button); + switch (state) { + case GLFW_PRESS: + button_states[v4d_button] = true; + return MouseEventType::PRESS; + case GLFW_RELEASE: + button_states[v4d_button] = false; + return MouseEventType::RELEASE; + default: + return MouseEventType::NONE; + } +} + +static cv::Point2d get_mouse_scroll_offset() { + return scroll_offset; +} + +static MouseEventType get_mouse_scroll_event() { + cv::Point2d current_offset = get_mouse_scroll_offset(); + if (current_offset != last_position) { + last_position = current_offset; + return MouseEventType::SCROLL; + } else { + return MouseEventType::NONE; + } +} + +static MouseEventType get_mouse_move_event() { + cv::Point2d current_position = get_mouse_position(); + if (current_position != last_position) { + last_position = current_position; + return MouseEventType::MOVE; + } else { + return MouseEventType::NONE; + } +} + +static MouseEventType get_mouse_drag_event(MouseButton button) { + cv::Point2d current_position = get_mouse_position(); + if (button_states[button] && current_position != last_position) { + last_position = current_position; + return MouseEventType::DRAG; + } else { + return MouseEventType::NONE; + } +} + +static MouseEventType get_mouse_hover_event() { + cv::Point2d current_position = get_mouse_position(); + if (current_position != last_position) { + last_position = current_position; + return MouseEventType::HOVER_ENTER; + } else { + return MouseEventType::HOVER_EXIT; + } +} + +enum class WindowEvent { + NONE, + RESIZE, + MOVE, + FOCUS, + UNFOCUS, + CLOSE +}; + +static WindowEvent get_window_resize_event() { + static WindowState last_state = window_state; + + if (window_state.size != last_state.size) { + last_state.size = window_state.size; + return WindowEvent::RESIZE; + } else { + return WindowEvent::NONE; + } +} + +static WindowEvent get_window_move_event() { + static WindowState last_state = window_state; + + if (window_state.position != last_state.position) { + last_state.position = window_state.position; + return WindowEvent::MOVE; + } else { + return WindowEvent::NONE; + } +} + +static WindowEvent get_window_focus_event() { + static WindowState last_state = window_state; + + if (window_state.focused && !last_state.focused) { + last_state.focused = window_state.focused; + return WindowEvent::FOCUS; + } else if (!window_state.focused && last_state.focused) { + last_state.focused = window_state.focused; + return WindowEvent::UNFOCUS; + } else { + return WindowEvent::NONE; + } +} + +static cv::Size get_window_size() { + int width, height; + glfwGetWindowSize(get_current_glfw_window(), &width, &height); + return cv::Size(width, height); +} + +static cv::Point get_window_position() { + int x, y; + glfwGetWindowPos(get_current_glfw_window(), &x, &y); + return cv::Point(x, y); +} + +static bool get_window_focus() { + int focused = glfwGetWindowAttrib(get_current_glfw_window(), GLFW_FOCUSED); + return focused; +} + +static void initialize_callbacks(GLFWwindow* window) { + glfwSetScrollCallback(window, scroll_callback); +} + +// Define an enum class for the V4D 
joystick buttons +enum class JoystickButton { + BUTTON_A, + BUTTON_B, + BUTTON_X, + BUTTON_Y, + BUTTON_LB, + BUTTON_RB, + BUTTON_BACK, + BUTTON_START, + BUTTON_GUIDE, + BUTTON_LEFT_THUMB, + BUTTON_RIGHT_THUMB, + BUTTON_DPAD_UP, + BUTTON_DPAD_RIGHT, + BUTTON_DPAD_DOWN, + BUTTON_DPAD_LEFT +}; + +// Define an enum class for the V4D joystick axes +enum class JoystickAxis { + AXIS_LEFT_X, + AXIS_LEFT_Y, + AXIS_RIGHT_X, + AXIS_RIGHT_Y, + AXIS_LEFT_TRIGGER, + AXIS_RIGHT_TRIGGER +}; + +// Define a static function that returns the state of a joystick button +static bool get_joystick_button_state(int joystick, JoystickButton button) { + int count; + const unsigned char* buttons = glfwGetJoystickButtons(joystick, &count); + if (buttons == nullptr) { + CV_Error(cv::Error::StsBadArg, "Failed to get joystick buttons. Please ensure the joystick is connected and working properly."); + } + return buttons[static_cast(button)]; +} + +// Define a static function that returns the name of a joystick +static const char* get_joystick_name(int joystick) { + const char* name = glfwGetJoystickName(joystick); + if (name == nullptr) { + CV_Error(cv::Error::StsBadArg, "Failed to get joystick name. Please ensure the joystick is connected and working properly."); + } + return name; +} + +// Define a static function that returns whether a joystick is present +static bool is_joystick_present(int joystick) { + int present = glfwJoystickPresent(joystick); + if (present != GLFW_TRUE && present != GLFW_FALSE) { + CV_Error(cv::Error::StsBadArg, "Failed to check if joystick is present. Please ensure the joystick is connected and working properly."); + } + return present; +} + +// Define a static function that sets the clipboard string +static void set_clipboard_string(const char* string) { + if (string == nullptr) { + CV_Error(cv::Error::StsNullPtr, "Cannot set clipboard string to null. Please provide a valid string."); + } + glfwSetClipboardString(get_current_glfw_window(), string); +} + +// Define a static function that gets the clipboard string +static const char* get_clipboard_string() { + const char* string = glfwGetClipboardString(get_current_glfw_window()); + if (string == nullptr) { + CV_Error(cv::Error::StsNullPtr, "Failed to get clipboard string. Please ensure there is a string in the clipboard."); + } + return string; +} + +} +} +} +#endif // MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_EVENTS_HPP_ diff --git a/modules/v4d/include/opencv2/v4d/nvg.hpp b/modules/v4d/include/opencv2/v4d/nvg.hpp new file mode 100644 index 000000000..37012f71b --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/nvg.hpp @@ -0,0 +1,509 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_V4D_NVG_HPP_ +#define SRC_OPENCV_V4D_NVG_HPP_ + +#include "opencv2/v4d/v4d.hpp" +#include +#include +#include "nanovg.h" +struct NVGcontext; + +namespace cv { +namespace v4d { +/*! + * In general please refer to https://github.com/memononen/nanovg/blob/master/src/nanovg.h for reference. + */ +namespace nvg { +/*! + * Equivalent of a NVGtextRow. + */ +struct CV_EXPORTS TextRow: public NVGtextRow { +}; + +/*! + * Equivalent of a NVGglyphPosition. + */ +struct CV_EXPORTS GlyphPosition: public NVGglyphPosition { +}; + +/*! + * Equivalent of a NVGPaint. Converts back and forth between the two representations (Paint/NVGPaint). 
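The event helpers in `events.hpp` above are deliberately polling-based and thread-local: a GLFW window must be made current for the calling thread (V4D does this around context calls) before any of them may be used. The following is a minimal usage sketch under that assumption; the include path is taken from this diff, everything else is illustrative.

```cpp
// Usage sketch only. Assumes cv::v4d::event::set_current_glfw_window() was
// already called for this thread (normally done by V4D before a context call).
#include <opencv2/v4d/events.hpp>
#include <GLFW/glfw3.h>
#include <iostream>

using namespace cv::v4d;

static void pollInputOnce() {
    // Keyboard: queries take raw GLFW key codes and update the thread-local
    // key_states map as a side effect.
    if (event::get_key_event_type(GLFW_KEY_ESCAPE) == event::KeyEventType::PRESS)
        std::cout << "ESC pressed" << std::endl;

    // Mouse: position and buttons are polled rather than delivered as callbacks
    // (only the scroll offset arrives via scroll_callback).
    cv::Point2d pos = event::get_mouse_position();
    if (event::get_mouse_event_type(GLFW_MOUSE_BUTTON_LEFT) == event::MouseEventType::PRESS)
        std::cout << "left click at " << pos << std::endl;

    // Window and joystick helpers follow the same polling pattern.
    std::cout << "window size: " << event::get_window_size() << std::endl;
    if (event::is_joystick_present(GLFW_JOYSTICK_1))
        std::cout << "joystick: " << event::get_joystick_name(GLFW_JOYSTICK_1) << std::endl;
}
```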
+ */ +struct CV_EXPORTS Paint { + Paint() { + } + Paint(const NVGpaint& np); + NVGpaint toNVGpaint(); + + float xform[6]; + float extent[2]; + float radius = 0; + float feather = 0; + cv::Scalar innerColor; + cv::Scalar outerColor; + int image = 0; +}; + +/*! + * Internals of the NanoVG wrapper + */ +namespace detail { +/*! + * Internal NanoVG singleton that wraps all NanoVG functions. + */ +class NVG { +private: + friend class V4D; + static thread_local NVG* nvg_instance_; + NVGcontext* ctx_ = nullptr; + NVG(NVGcontext* ctx) : + ctx_(ctx) { + } +public: + /*! + * Initialize the current NVG object; + * @param ctx The NVGcontext to create the NVG object from. + */ + static void initializeContext(NVGcontext* ctx); + /*! + * Get the current NVGcontext. + * @return The current NVGcontext context. + */ + static NVG* getCurrentContext(); + + /*! + * Get the underlying NVGcontext. + * @return The underlying NVGcontext. + */ + NVGcontext* getContext() { + assert(ctx_ != nullptr); + return ctx_; + } +public: + int createFont(const char* name, const char* filename); + int createFontMem(const char* name, unsigned char* data, int ndata, int freeData); + int findFont(const char* name); + int addFallbackFontId(int baseFont, int fallbackFont); + int addFallbackFont(const char* baseFont, const char* fallbackFont); + void fontSize(float size); + void fontBlur(float blur); + void textLetterSpacing(float spacing); + void textLineHeight(float lineHeight); + void textAlign(int align); + void fontFaceId(int font); + void fontFace(const char* font); + float text(float x, float y, const char* string, const char* end); + void textBox(float x, float y, float breakRowWidth, const char* string, const char* end); + float textBounds(float x, float y, const char* string, const char* end, float* bounds); + void textBoxBounds(float x, float y, float breakRowWidth, const char* string, const char* end, + float* bounds); + int textGlyphPositions(float x, float y, const char* string, const char* end, + GlyphPosition* positions, int maxPositions); + void textMetrics(float* ascender, float* descender, float* lineh); + int textBreakLines(const char* string, const char* end, float breakRowWidth, TextRow* rows, + int maxRows); + + void save(); + void restore(); + void reset(); + +// void shapeAntiAlias(int enabled); + void strokeColor(const cv::Scalar& bgra); + void strokePaint(Paint paint); + void fillColor(const cv::Scalar& bgra); + void fillPaint(Paint paint); + void miterLimit(float limit); + void strokeWidth(float size); + void lineCap(int cap); + void lineJoin(int join); + void globalAlpha(float alpha); + + void resetTransform(); + void transform(float a, float b, float c, float d, float e, float f); + void translate(float x, float y); + void rotate(float angle); + void skewX(float angle); + void skewY(float angle); + void scale(float x, float y); + void currentTransform(float* xform); + void transformIdentity(float* dst); + void transformTranslate(float* dst, float tx, float ty); + void transformScale(float* dst, float sx, float sy); + void transformRotate(float* dst, float a); + void transformSkewX(float* dst, float a); + void transformSkewY(float* dst, float a); + void transformMultiply(float* dst, const float* src); + void transformPremultiply(float* dst, const float* src); + int transformInverse(float* dst, const float* src); + void transformPoint(float* dstx, float* dsty, const float* xform, float srcx, float srcy); + + float degToRad(float deg); + float radToDeg(float rad); + + int createImage(const char* filename, 
int imageFlags); + int createImageMem(int imageFlags, unsigned char* data, int ndata); + int createImageRGBA(int w, int h, int imageFlags, const unsigned char* data); + void updateImage(int image, const unsigned char* data); + void imageSize(int image, int* w, int* h); + void deleteImage(int image); + + void beginPath(); + void moveTo(float x, float y); + void lineTo(float x, float y); + void bezierTo(float c1x, float c1y, float c2x, float c2y, float x, float y); + void quadTo(float cx, float cy, float x, float y); + void arcTo(float x1, float y1, float x2, float y2, float radius); + void closePath(); + void pathWinding(int dir); + void arc(float cx, float cy, float r, float a0, float a1, int dir); + void rect(float x, float y, float w, float h); + void roundedRect(float x, float y, float w, float h, float r); + void roundedRectVarying(float x, float y, float w, float h, float radTopLeft, float radTopRight, + float radBottomRight, float radBottomLeft); + void ellipse(float cx, float cy, float rx, float ry); + void circle(float cx, float cy, float r); + void fill(); + void stroke(); + + Paint linearGradient(float sx, float sy, float ex, float ey, const cv::Scalar& icol, + const cv::Scalar& ocol); + Paint boxGradient(float x, float y, float w, float h, float r, float f, const cv::Scalar& icol, + const cv::Scalar& ocol); + Paint radialGradient(float cx, float cy, float inr, float outr, const cv::Scalar& icol, + const cv::Scalar& ocol); + Paint imagePattern(float ox, float oy, float ex, float ey, float angle, int image, float alpha); + void scissor(float x, float y, float w, float h); + void intersectScissor(float x, float y, float w, float h); + void resetScissor(); +}; +} // namespace detail + +/*! + * A forward to nvgCreateFont. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int createFont(const char* name, const char* filename); +/*! + * A forward to nvgCreateFontMem. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int createFontMem(const char* name, unsigned char* data, int ndata, int freeData); +/*! + * A forward to nvgFindFont. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int findFont(const char* name); +/*! + * A forward to nvgAddFallbackFontId. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int addFallbackFontId(int baseFont, int fallbackFont); +/*! + * A forward to nvgAddFallbackFont. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int addFallbackFont(const char* baseFont, const char* fallbackFont); +/*! + * A forward to nvgFontSize. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fontSize(float size); +/*! + * A forward to nvgFontBlur. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fontBlur(float blur); +/*! + * A forward to nvgTextLetterSpacing. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void textLetterSpacing(float spacing); +/*! + * A forward to nvgTextLineHeight. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void textLineHeight(float lineHeight); +/*! + * A forward to nvgTextAlign. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void textAlign(int align); +/*! + * A forward to nvgFontFaceId. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fontFaceId(int font); +/*! 
+ * A forward to nvgFontFace. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fontFace(const char* font); +/*! + * A forward to nvgText. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS float text(float x, float y, const char* string, const char* end); +/*! + * A forward to nvgTextBox. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void textBox(float x, float y, float breakRowWidth, const char* string, const char* end); +/*! + * A forward to nvgTextBounds. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS float textBounds(float x, float y, const char* string, const char* end, float* bounds); +/*! + * A forward to nvgTextBoxBounds. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void textBoxBounds(float x, float y, float breakRowWidth, const char* string, const char* end, + float* bounds); +/*! + * A forward to nvgTextGlyphPositions. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int textGlyphPositions(float x, float y, const char* string, const char* end, + GlyphPosition* positions, int maxPositions); +/*! + * A forward to nvgTextMetrics. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void textMetrics(float* ascender, float* descender, float* lineh); +/*! + * A forward to nvgTextBreakLines. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int textBreakLines(const char* string, const char* end, float breakRowWidth, TextRow* rows, + int maxRows); +/*! + * A forward to nvgSave. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void save(); +/*! + * A forward to nvgRestore. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void restore(); +/*! + * A forward to nvgReset. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void reset(); +///*! +// * A forward to nvgShapeAntiAlias. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h +// */ +//CV_EXPORTS void shapeAntiAlias(int enabled); +/*! + * A forward to nvgStrokeColor. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void strokeColor(const cv::Scalar& bgra); +/*! + * A forward to nvgStrokePaint. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void strokePaint(Paint paint); +/*! + * A forward to nvgFillColor. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fillColor(const cv::Scalar& color); +/*! + * A forward to nvgFillPaint. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fillPaint(Paint paint); +/*! + * A forward to nvgMiterLimit. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void miterLimit(float limit); +/*! + * A forward to nvgStrokeWidth. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void strokeWidth(float size); +/*! + * A forward to nvgLineCap. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void lineCap(int cap); +/*! + * A forward to nvgLineJoin. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void lineJoin(int join); +/*! + * A forward to nvgGlobalAlpha. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void globalAlpha(float alpha); + +/*! 
+ * A forward to nvgResetTransform. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void resetTransform(); +/*! + * A forward to nvgTransform. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transform(float a, float b, float c, float d, float e, float f); +/*! + * A forward to nvgTranslate. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void translate(float x, float y); +/*! + * A forward to nvgRotate. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void rotate(float angle); +/*! + * A forward to nvgSkewX. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void skewX(float angle); +/*! + * A forward to nvgSkewY. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void skewY(float angle); +/*! + * A forward to nvgScale. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void scale(float x, float y); +/*! + * A forward to nvgCurrentTransform. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void currentTransform(float* xform); +/*! + * A forward to nvgTransformIdentity. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformIdentity(float* dst); +/*! + * A forward to nvgTransformTranslate. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformTranslate(float* dst, float tx, float ty); +/*! + * A forward to nvgTransformScale. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformScale(float* dst, float sx, float sy); +/*! + * A forward to nvgTransformRotate. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformRotate(float* dst, float a); +/*! + * A forward to nvgTransformSkewX. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformSkewX(float* dst, float a); +/*! + * A forward to nvgTransformSkewY. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformSkewY(float* dst, float a); +/*! + * A forward to nvgTransformMultiply. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformMultiply(float* dst, const float* src); +/*! + * A forward to nvgTransformPremultiply. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformPremultiply(float* dst, const float* src); +/*! + * A forward to nvgTransformInverse. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS int transformInverse(float* dst, const float* src); +/*! + * A forward to nvgTransformPoint. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void transformPoint(float* dstx, float* dsty, const float* xform, float srcx, float srcy); + +/*! + * A forward to nvgDegToRad. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS float degToRad(float deg); +/*! + * A forward to nvgRadToDeg. 
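The transform forwards behave exactly like their NanoVG counterparts: they mutate the current render state, so they are typically bracketed by `save()`/`restore()`. A short hypothetical sketch, meant to run inside a NanoVG rendering pass:

```cpp
// Hypothetical sketch; must run where a NanoVG context is current (i.e. inside
// a V4D NanoVG rendering pass). Angles are radians, as in NanoVG itself.
#include <opencv2/v4d/nvg.hpp>

using namespace cv::v4d;

static void placeRotated(float cx, float cy, float angleDeg) {
    nvg::save();                           // push the current render state
    nvg::translate(cx, cy);                // move the origin to (cx, cy)
    nvg::rotate(nvg::degToRad(angleDeg));  // rotate subsequent drawing
    nvg::scale(0.5f, 0.5f);                // draw at half size
    // ... issue path/drawing calls here ...
    nvg::restore();                        // pop back to the previous state
}
```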
See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS float radToDeg(float rad); + +CV_EXPORTS int createImage(const char* filename, int imageFlags); +CV_EXPORTS int createImageMem(int imageFlags, unsigned char* data, int ndata); +CV_EXPORTS int createImageRGBA(int w, int h, int imageFlags, const unsigned char* data); +CV_EXPORTS void updateImage(int image, const unsigned char* data); +CV_EXPORTS void imageSize(int image, int* w, int* h); +CV_EXPORTS void deleteImage(int image); + +/*! + * A forward to nvgBeginPath. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void beginPath(); +/*! + * A forward to nvgMoveTo. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void moveTo(float x, float y); +/*! + * A forward to nvgLineTo. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void lineTo(float x, float y); +/*! + * A forward to nvgBezierTo. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void bezierTo(float c1x, float c1y, float c2x, float c2y, float x, float y); +/*! + * A forward to nvgQuadTo. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void quadTo(float cx, float cy, float x, float y); +/*! + * A forward to nvgArcTo. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void arcTo(float x1, float y1, float x2, float y2, float radius); +/*! + * A forward to nvgClosePath. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void closePath(); +/*! + * A forward to nvgPathWinding. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void pathWinding(int dir); +/*! + * A forward to nvgArc. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void arc(float cx, float cy, float r, float a0, float a1, int dir); +/*! + * A forward to nvgRect. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void rect(float x, float y, float w, float h); +/*! + * A forward to nvgRoundedRect. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void roundedRect(float x, float y, float w, float h, float r); +/*! + * A forward to nvgRoundedRectVarying. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void roundedRectVarying(float x, float y, float w, float h, float radTopLeft, float radTopRight, + float radBottomRight, float radBottomLeft); +/*! + * A forward to nvgEllipse. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void ellipse(float cx, float cy, float rx, float ry); +/*! + * A forward to nvgCircle. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void circle(float cx, float cy, float r); +/*! + * A forward to nvgFill. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void fill(); +/*! + * A forward to nvgStroke. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void stroke(); + +/*! + * A forward to nvgLinearGradient. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS Paint linearGradient(float sx, float sy, float ex, float ey, const cv::Scalar& icol, + const cv::Scalar& ocol); +/*! + * A forward to nvgBoxGradient. 
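Paths, fills and text combine in the usual NanoVG begin-path/fill pattern. A small illustrative routine follows; it assumes a NanoVG pass is active and that a font named "sans" was registered earlier via `nvg::createFont` (the font name and all coordinates are made up):

```cpp
// Illustrative only; assumes a NanoVG pass is active and that
// nvg::createFont("sans", ...) was called beforehand with a valid TTF path.
#include <opencv2/v4d/nvg.hpp>

using namespace cv::v4d;

static void drawBadge(float x, float y) {
    nvg::beginPath();
    nvg::roundedRect(x, y, 120.0f, 40.0f, 8.0f);
    nvg::fillColor(cv::Scalar(255, 128, 32, 255));     // colors are passed as BGRA scalars
    nvg::fill();

    nvg::fontFace("sans");
    nvg::fontSize(22.0f);
    nvg::fillColor(cv::Scalar(255, 255, 255, 255));
    nvg::text(x + 10.0f, y + 27.0f, "V4D", nullptr);   // end == nullptr: zero-terminated string
}
```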
See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS Paint boxGradient(float x, float y, float w, float h, float r, float f, const cv::Scalar& icol, + const cv::Scalar& ocol); +/*! + * A forward to nvgRadialGradient. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS Paint radialGradient(float cx, float cy, float inr, float outr, const cv::Scalar& icol, + const cv::Scalar& ocol); +/*! + * A forward to nvgImagePattern. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS Paint imagePattern(float ox, float oy, float ex, float ey, float angle, int image, float alpha); +/*! + * A forward to nvgScissor. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void scissor(float x, float y, float w, float h); +/*! + * A forward to nvgIntersectScissor. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void intersectScissor(float x, float y, float w, float h); +/*! + * A forward to nvgRresetScissor. See https://github.com/memononen/nanovg/blob/master/src/nanovg.h + */ +CV_EXPORTS void resetScissor(); + +CV_EXPORTS void clear(const cv::Scalar& bgra = cv::Scalar(0, 0, 0, 255)); +} +} +} + +#endif /* SRC_OPENCV_V4D_NVG_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/scene.hpp b/modules/v4d/include/opencv2/v4d/scene.hpp new file mode 100644 index 000000000..3e95b987e --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/scene.hpp @@ -0,0 +1,198 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef MODULES_V4D_SRC_SCENE_HPP_ +#define MODULES_V4D_SRC_SCENE_HPP_ + +#include "v4d.hpp" +#include +#include +#include + +namespace cv { +namespace v4d { +namespace gl { + +cv::Vec3f rotate3D(const cv::Vec3f& point, const cv::Vec3f& center, const cv::Vec3f& rotation); +cv::Matx44f perspective(float fov, float aspect, float zNear, float zFar); +cv::Matx44f lookAt(cv::Vec3f eye, cv::Vec3f center, cv::Vec3f up); +cv::Matx44f modelView(const cv::Vec3f& translation, const cv::Vec3f& rotationVec, const cv::Vec3f& scaleVec); + +class Scene { +public: + enum RenderMode { + DEFAULT = 0, + WIREFRAME = 1, + POINTCLOUD = 2, + }; +private: + Assimp::Importer importer_; + const aiScene* scene_ = nullptr; + RenderMode mode_ = DEFAULT; + GLuint shaderHandles_[3] = {0, 0, 0}; + cv::Vec3f lightPos_ = {1.2f, 1.0f, 2.0f}; + cv::Vec3f viewPos_ = {0.0, 0.0, 0.0}; + + cv::Vec3f autoCenter_, size_; + float autoScale_ = 1; + + const string vertexShaderSource_ = R"( + #version 300 es + layout(location = 0) in vec3 aPos; + out vec3 fragPos; + uniform mat4 model; + uniform mat4 view; + uniform mat4 projection; + void main() { + gl_Position = projection * view * model * vec4(aPos, 1.0); + fragPos = vec3(model * vec4(aPos, 1.0)); + gl_PointSize = 3.0; // Set the size_ of the points + } + )"; + + + const string fragmentShaderSource_ = R"( +#version 300 es + +#define RENDER_MODE_WIREFRAME 1 +#define RENDER_MODE_POINTCLOUD 2 + +#define AMBIENT_COLOR vec3(0.95, 0.95, 0.95) +#define DIFFUSE_COLOR vec3(0.8, 0.8, 0.8) +#define SPECULAR_COLOR vec3(0.7, 0.7, 0.7) + +// Control defines for effects +#define ENABLE_HDR true +#define HDR_EXPOSURE 1.0 + +#define ENABLE_BLOOM true +#define BLOOM_INTENSITY 1.0 + +#define ENABLE_SHADOWS true + +precision highp float; + +in vec3 fragPos; +out vec4 fragColor; + +uniform 
vec3 lightPos; +uniform vec3 viewPos; +uniform int renderMode; + +// Function to check ray-sphere intersection +bool intersectSphere(vec3 rayOrigin, vec3 rayDir, vec3 sphereCenter, float sphereRadius) { + vec3 oc = rayOrigin - sphereCenter; + float a = dot(rayDir, rayDir); + float b = 2.0 * dot(oc, rayDir); + float c = dot(oc, oc) - sphereRadius * sphereRadius; + float discriminant = b * b - 4.0 * a * c; + return (discriminant > 0.0); +} + +// Function to check if a point is in shadow +bool isInShadow(vec3 fragPos, vec3 lightDir) { + // Use ray tracing to check for shadows (sphere example) + vec3 rayOrigin = fragPos + 0.001 * normalize(lightDir); // Slightly offset to avoid self-intersection + vec3 sphereCenter = vec3(0.0, 1.0, 0.0); // Example sphere center + float sphereRadius = 0.5; // Example sphere radius + + if (intersectSphere(rayOrigin, lightDir, sphereCenter, sphereRadius)) { + return true; // Point is in shadow + } + + return false; // Point is illuminated +} + +// HDR tone mapping function +vec3 toneMap(vec3 color, float exposure) { + return 1.0 - exp(-color * exposure); +} + +void main() { + vec4 attuned; + if (renderMode == RENDER_MODE_WIREFRAME) { + attuned = vec4(1.0, 0.0, 0.0, 1.0); + } else if (renderMode == RENDER_MODE_POINTCLOUD) { + float distance = length(fragPos - viewPos); + float attenuation = pow(1.0 / distance, 16.0); + vec3 color = vec3(1.0, 1.0, 1.0); + attuned = vec4(color, attenuation); + } else { + attuned = vec4(0.8, 0.8, 0.8, 1.0); + } + + vec3 ambient = 0.7 * attuned.xyz * AMBIENT_COLOR; + vec3 lightDir = normalize(lightPos - fragPos); + + // Check if the point is in shadow + #ifdef ENABLE_SHADOWS + if (isInShadow(fragPos, lightDir)) { + fragColor = vec4(ambient, 1.0); // Point is in shadow + return; + } + #endif + + float diff = max(dot(normalize(fragPos), lightDir), 0.0); + vec3 diffuse = diff * attuned.xyz * DIFFUSE_COLOR; + vec3 viewDir = normalize(viewPos - fragPos); + vec3 reflectDir = reflect(-lightDir, normalize(fragPos)); + float spec = pow(max(dot(viewDir, reflectDir), 0.0), 32.0); + vec3 specular = spec * SPECULAR_COLOR; + + // Combine ambient, diffuse, and specular components + vec3 finalColor = ambient + diffuse + specular; + + // Apply HDR tone mapping + #ifdef ENABLE_HDR + finalColor = toneMap(finalColor, HDR_EXPOSURE); + #endif + + // Bloom effect + #ifdef ENABLE_BLOOM + vec3 brightColor = finalColor - ambient; + finalColor += BLOOM_INTENSITY * brightColor; + #endif + + fragColor = vec4(finalColor, 1.0); +} + +)"; +public: + Scene(); + virtual ~Scene(); + void reset(); + bool load(const std::vector& points); + bool load(const std::string& filename); + void render(const cv::Rect& viewport, const cv::Matx44f& projection, const cv::Matx44f& view, const cv::Matx44f& modelView); + cv::Mat_ pointCloudAsMat(); + std::vector pointCloudAsVector(); + + float autoScale() { + return autoScale_; + } + + cv::Vec3f autoCenter() { + return autoCenter_; + } + + void setMode(RenderMode mode) { + mode_ = mode; + } + + cv::Vec3f lightPosition() { + return lightPos_; + } + + void setLightPosition(cv::Vec3f pos) { + lightPos_ = pos; + } + +}; + +} /* namespace gl */ +} /* namespace v4d */ +} /* namespace cv */ + +#endif /* MODULES_V4D_SRC_SCENE_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/sink.hpp b/modules/v4d/include/opencv2/v4d/sink.hpp new file mode 100644 index 000000000..14396d1f8 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/sink.hpp @@ -0,0 +1,61 @@ +// This file is part of OpenCV project. 
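The `Scene` class above is typically driven with the free matrix helpers declared at the top of `scene.hpp`. A minimal, hypothetical load-and-render sketch (the model path and field-of-view value are assumptions, and the fov is taken to be in radians):

```cpp
// Hypothetical sketch; assumes an active OpenGL context and an Assimp-readable
// model file. Field of view is assumed to be expressed in radians.
#include <opencv2/v4d/scene.hpp>

using namespace cv::v4d::gl;

static bool loadAndRender(Scene& scene, const cv::Rect& viewport) {
    if (!scene.load("model.obj"))          // illustrative path
        return false;

    const float aspect = float(viewport.width) / float(viewport.height);
    cv::Matx44f proj = perspective(0.785f /* ~45 deg */, aspect, 0.1f, 100.0f);
    cv::Matx44f view = lookAt(cv::Vec3f(0, 0, 3), cv::Vec3f(0, 0, 0), cv::Vec3f(0, 1, 0));

    // Fit the model into view using the automatically computed center and scale.
    const cv::Vec3f c = scene.autoCenter();
    const float s = scene.autoScale();
    cv::Matx44f model = modelView(cv::Vec3f(-c[0], -c[1], -c[2]), cv::Vec3f(0, 0, 0),
                                  cv::Vec3f(s, s, s));

    scene.setMode(Scene::DEFAULT);
    scene.render(viewport, proj, view, model);
    return true;
}
```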
+// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_V4D_SINK_HPP_ +#define SRC_OPENCV_V4D_SINK_HPP_ + +#include +#include +#include +#include +#include + +namespace cv { +namespace v4d { + +/*! + * A Sink object represents a way to write data produced by V4D (e.g. a video-file). + */ +class CV_EXPORTS Sink { + std::mutex mtx_; + bool open_ = true; + uint64_t nextSeq_ = 0; + std::map buffer_; + std::function consumer_; +public: + /*! + * Constructs the Sink object from a consumer functor. + * @param consumer A function object that consumes a UMat frame (e.g. writes it to a video file). + */ + CV_EXPORTS Sink(std::function consumer); + /*! + * Constucts a null Sink that is never open or ready + */ + CV_EXPORTS Sink(); + /*! + * Default destructor + */ + CV_EXPORTS virtual ~Sink(); + /*! + * Signals if the sink is ready to consume data. + * @return true if the sink is ready. + */ + CV_EXPORTS bool isReady(); + /*! + * Determines if the sink is open. + * @return true if the sink is open. + */ + CV_EXPORTS bool isOpen(); + /*! + * The sink operator. It accepts a UMat frame to pass to the consumer + * @param frame The frame to pass to the consumer. (e.g. VideoWriter) + */ + CV_EXPORTS void operator()(const uint64_t& seq, const cv::UMat& frame); +}; + +} /* namespace v4d */ +} /* namespace kb */ + +#endif /* SRC_OPENCV_V4D_SINK_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/source.hpp b/modules/v4d/include/opencv2/v4d/source.hpp new file mode 100644 index 000000000..69d3bde10 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/source.hpp @@ -0,0 +1,74 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_V4D_SOURCE_HPP_ +#define SRC_OPENCV_V4D_SOURCE_HPP_ + +#include +#include +#include +#include + +namespace cv { +namespace v4d { + +/*! + * A Source object represents a way to provide data to V4D by using + * a generator functor. + */ +class CV_EXPORTS Source { + bool open_ = true; + std::function generator_; + uint64_t count_ = 0; + float fps_; + bool threadSafe_ = false; + std::mutex mtx_; +public: + /*! + * Constructs the Source object from a generator functor. + * @param generator A function object that accepts a reference to a UMat frame + * that it manipulates. This is ultimatively used to provide video data to #cv::viz::V4D + * @param fps The fps the Source object provides data with. + */ + CV_EXPORTS Source(std::function generator, float fps); + /*! + * Constructs a null Source that is never open or ready. + */ + CV_EXPORTS Source(); + /*! + * Default destructor. + */ + CV_EXPORTS virtual ~Source(); + /*! + * Signals if the source is ready to provide data. + * @return true if the source is ready. + */ + CV_EXPORTS bool isReady(); + CV_EXPORTS bool isThreadSafe(); + CV_EXPORTS void setThreadSafe(bool ts); + + + /*! + * Determines if the source is open. + * @return true if the source is open. + */ + CV_EXPORTS bool isOpen(); + /*! + * Returns the fps the underlying generator provides data with. + * @return The fps of the Source object. + */ + CV_EXPORTS float fps(); + /*! + * The source operator. It returns the frame count and the frame generated + * (e.g. by VideoCapture)in a pair. 
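`Source` and `Sink` above are thin functor adapters. The `std::function` template arguments are not visible in this diff view, so the generator and consumer signatures used in the sketch below (`bool(cv::UMat&)` and `void(uint64_t, const cv::UMat&)`) are assumptions inferred from the surrounding documentation and from `Sink::operator()(seq, frame)`; treat the whole snippet as illustrative:

```cpp
// Illustrative only: the functor signatures are assumed, not taken verbatim
// from this header (the template arguments are stripped in the diff).
#include <opencv2/v4d/source.hpp>
#include <opencv2/v4d/sink.hpp>

static cv::Ptr<cv::v4d::Source> makeTestSource() {
    // Generator: fills the frame it is handed and reports 30 fps.
    return cv::makePtr<cv::v4d::Source>([](cv::UMat& frame) {
        frame.create(cv::Size(640, 480), CV_8UC3);
        frame.setTo(cv::Scalar(0, 255, 0));
        return true;                                   // keep producing frames
    }, 30.0f);
}

static cv::Ptr<cv::v4d::Sink> makeTestSink() {
    // Consumer: receives a sequence number and the finished frame.
    return cv::makePtr<cv::v4d::Sink>([](uint64_t seq, const cv::UMat& frame) {
        CV_UNUSED(seq);
        CV_UNUSED(frame);                              // a real sink would e.g. feed a VideoWriter
    });
}
```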
+ * @return A pair containing the frame count and the frame generated. + */ + CV_EXPORTS std::pair operator()(); +}; + +} /* namespace v4d */ +} /* namespace kb */ + +#endif /* SRC_OPENCV_V4D_SOURCE_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/threadsafemap.hpp b/modules/v4d/include/opencv2/v4d/threadsafemap.hpp new file mode 100644 index 000000000..d02a5cdb0 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/threadsafemap.hpp @@ -0,0 +1,124 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + + +#ifndef MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_THREADSAFEMAP_HPP_ +#define MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_THREADSAFEMAP_HPP_ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +namespace cv { +namespace v4d { + +// A concept to check if a type is hashable +template +concept Hashable = requires(T a) { + { std::hash{}(a) } -> std::convertible_to; +}; + +// A concept to check if a type can be stored in an std::unordered_map as value +template +concept Mappable = requires(T a) { + { std::any_cast(std::any{}) } -> std::same_as; +}; + +// A class that can set and get values in a thread-safe manner (per-key locking) +template +class ThreadSafeMap { +private: + // A map from keys to values + std::unordered_map map; + + // A map from keys to mutexes + std::unordered_map mutexes; + + // A mutex to lock the map + std::shared_mutex map_mutex; + +public: + // A method to set a value for a given key + template + void set(K key, V value) { + // Lock the map mutex for writing + std::unique_lock map_lock(map_mutex); + + // Check if the key exists in the map + if (map.find(key) == map.end()) { + // If the key does not exist, insert it into the map and the mutexes + map[key] = value; + mutexes[key]; + } else { + // If the key exists, lock the mutex for the key for writing + std::unique_lock key_lock(mutexes[key]); + + // Set the value for the key + map[key] = value; + } + } + + // A method to get a value for a given key + template + V get(K key) { + // Lock the map mutex for reading + std::shared_lock map_lock(map_mutex); + + // Check if the key exists in the map + if (map.find(key) == map.end()) { + CV_Error(Error::StsError, "Key not found in map"); + } + + // Lock the mutex for the key for reading + std::shared_lock key_lock(mutexes[key]); + + // Get the value for the key + return std::any_cast(map[key]); + } + + template void on(K key, F func) { + // Lock the map mutex for reading + std::shared_lock map_lock(map_mutex); + + // Check if the key exists in the map + if (map.find(key) == map.end()) { + CV_Error(Error::StsError, "Key not found in map"); + } + + // Lock the mutex for the key for writing + std::unique_lock key_lock(mutexes[key]); + + // Get the value for the key + std::any value = map[key]; + + // Apply the functor to the value + func(value); + + // Set the value for the key + map[key] = value; + } + + // A method to get a pointer to the value for a given key + // Note: This function is not thread-safe + template + V* ptr(K key) { + return std::any_cast(&map[key]); + } +}; + +} +} + + + +#endif /* MODULES_V4D_INCLUDE_OPENCV2_V4D_DETAIL_THREADSAFEMAP_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/util.hpp b/modules/v4d/include/opencv2/v4d/util.hpp new file mode 100644 index 000000000..fd2a7b979 --- /dev/null +++ 
b/modules/v4d/include/opencv2/v4d/util.hpp @@ -0,0 +1,531 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_V4D_UTIL_HPP_ +#define SRC_OPENCV_V4D_UTIL_HPP_ + +#include "source.hpp" +#include "sink.hpp" +#include +#include +#include +#ifdef __GNUG__ +#include +#include +#include +#endif +#include +#include + +#include +#include +#include +#include +#include +#include + +namespace cv { +namespace v4d { +namespace detail { + +using std::cout; +using std::endl; + +inline uint64_t get_epoch_nanos() { + return std::chrono::duration_cast(std::chrono::system_clock::now().time_since_epoch()).count(); +} + +static thread_local std::mutex mtx_; + +class CV_EXPORTS ThreadLocal { +public: + CV_EXPORTS static std::mutex& mutex() { + return mtx_; + } +}; + +class CV_EXPORTS Global { + inline static std::mutex global_mtx_; + + inline static std::mutex frame_cnt_mtx_; + inline static uint64_t frame_cnt_ = 0; + + inline static std::mutex start_time_mtx_; + inline static uint64_t start_time_ = get_epoch_nanos(); + + inline static std::mutex fps_mtx_; + inline static double fps_ = 0; + + inline static std::mutex thread_id_mtx_; + inline static const std::thread::id default_thread_id_; + inline static std::thread::id main_thread_id_; + inline static thread_local bool is_main_; + + inline static uint64_t run_cnt_ = 0; + inline static bool first_run_ = true; + + inline static size_t workers_ready_ = 0; + inline static size_t workers_started_ = 0; + inline static size_t next_worker_idx_ = 0; + inline static std::mutex sharedMtx_; + inline static std::map shared_; + typedef typename std::map::iterator Iterator; +public: + template + class Scope { + private: + const T& t_; + +// ocl::OpenCLExecutionContext* pSavedExecCtx_ = nullptr; +// ocl::OpenCLExecutionContext* pExecCtx_ = nullptr; +// +// template void bind(const Tunused& t) { +// //do nothing for all other types the UMat +// CV_UNUSED(t); +// } +// +// void bind(const cv::UMat& t) { +//#ifdef HAVE_OPENCL +// if(ocl::useOpenCL()) { +// pExecCtx_ = (t.u && t.u->allocatorContext) ? 
static_cast(t.u->allocatorContext.get()) : nullptr; +// if(pExecCtx_ && !pExecCtx_->empty()) { +// pSavedExecCtx_ = &ocl::OpenCLExecutionContext::getCurrentRef(); +// pExecCtx_->bind(); +// } else { +// pSavedExecCtx_ = nullptr; +// } +// } +//#endif +// } +// +// template void unbind(const Tunused& t) { +// //do nothing for all other types the UMat +// CV_UNUSED(t); +// } +// +// void unbind(const cv::UMat& t) { +// CV_UNUSED(t); +//#ifdef HAVE_OPENCL +// if(ocl::useOpenCL() && pSavedExecCtx_ && !pSavedExecCtx_->empty()) { +// pSavedExecCtx_->bind(); +// } +//#endif +// } + +public: + + Scope(const T& t) : t_(t) { + lock(t_); +// bind(t_); + } + + ~Scope() { + unlock(t_); +// unbind(t_); + } + }; + + CV_EXPORTS static std::mutex& mutex() { + return global_mtx_; + } + + CV_EXPORTS static uint64_t next_frame_cnt() { + std::unique_lock lock(frame_cnt_mtx_); + return frame_cnt_++; + } + + CV_EXPORTS static uint64_t frame_cnt() { + std::unique_lock lock(frame_cnt_mtx_); + return frame_cnt_; + } + + CV_EXPORTS static void mul_frame_cnt(const double& factor) { + std::unique_lock lock(frame_cnt_mtx_); + frame_cnt_ *= factor; + } + + CV_EXPORTS static void add_to_start_time(const size_t& st) { + std::unique_lock lock(start_time_mtx_); + start_time_ += st; + } + + CV_EXPORTS static uint64_t start_time() { + std::unique_lock lock(start_time_mtx_); + return start_time_; + } + + CV_EXPORTS static double fps() { + std::unique_lock lock(fps_mtx_); + return fps_; + } + + CV_EXPORTS static void set_fps(const double& f) { + std::unique_lock lock(fps_mtx_); + fps_ = f; + } + + CV_EXPORTS static void set_main_id(const std::thread::id& id) { + std::unique_lock lock(thread_id_mtx_); + main_thread_id_ = id; + } + + CV_EXPORTS static const bool is_main() { + std::unique_lock lock(start_time_mtx_); + return (main_thread_id_ == default_thread_id_ || main_thread_id_ == std::this_thread::get_id()); + } + + CV_EXPORTS static bool is_first_run() { + static std::mutex mtx; + std::unique_lock lock(mtx); + bool f = first_run_; + first_run_ = false; + return f; + } + + CV_EXPORTS static uint64_t next_run_cnt() { + static std::mutex mtx; + std::unique_lock lock(mtx); + return run_cnt_++; + } + + CV_EXPORTS static void set_workers_started(const size_t& ws) { + static std::mutex mtx; + std::unique_lock lock(mtx); + workers_started_ = ws; + } + + CV_EXPORTS static size_t workers_started() { + static std::mutex mtx; + std::unique_lock lock(mtx); + return workers_started_; + } + + CV_EXPORTS static size_t next_worker_ready() { + static std::mutex mtx; + std::unique_lock lock(mtx); + return ++workers_ready_; + } + + CV_EXPORTS static size_t next_worker_idx() { + static std::mutex mtx; + std::unique_lock lock(mtx); + return next_worker_idx_++; + } + + template + static bool isShared(const T& shared) { + std::lock_guard guard(sharedMtx_); + std::cerr << "shared:" << reinterpret_cast(&shared) << std::endl; + return shared_.find(reinterpret_cast(&shared)) != shared_.end(); + } + + template + static void registerShared(const T& shared) { + std::lock_guard guard(sharedMtx_); + std::cerr << "register:" << reinterpret_cast(&shared) << std::endl; + shared_.insert(std::make_pair(reinterpret_cast(&shared), new std::mutex())); + } + + template + static void lock(const T& shared) { + Iterator it, end; + std::mutex* mtx = nullptr; + { + std::lock_guard guard(sharedMtx_); + it = shared_.find(reinterpret_cast(&shared)); + end = shared_.end(); + if(it != end) { + mtx = (*it).second; + } + } + + if(mtx != nullptr) { + mtx->lock(); + return; + } + 
CV_Assert(!"You are trying to lock a non-shared variable"); + } + + template + static void unlock(const T& shared) { + Iterator it, end; + std::mutex* mtx = nullptr; + { + std::lock_guard guard(sharedMtx_); + it = shared_.find(reinterpret_cast(&shared)); + end = shared_.end(); + if(it != end) { + mtx = (*it).second; + } + } + + if(mtx != nullptr) { + mtx->unlock(); + return; + } + + CV_Assert(!"You are trying to unlock a non-shared variable"); + } + + template + static T safe_copy(const T& shared) { + std::lock_guard guard(sharedMtx_); + auto it = shared_.find(reinterpret_cast(&shared)); + + if(it != shared_.end()) { + std::lock_guard guard(*(*it).second); + return shared; + } else { + CV_Assert(!"You are unnecessarily safe copying a variable"); + //unreachable + return shared; + } + } + + static cv::UMat safe_copy(const cv::UMat& shared) { + std::lock_guard guard(sharedMtx_); + cv::UMat copy; + auto it = shared_.find(reinterpret_cast(&shared)); + if(it != shared_.end()) { + std::lock_guard guard(*(*it).second); + //workaround for context conflicts + shared.getMat(cv::ACCESS_READ).copyTo(copy); + return copy; + } else { + CV_Assert(!"You are unnecessarily safe copying a variable"); + //unreachable + shared.getMat(cv::ACCESS_READ).copyTo(copy); + return copy; + } + } +}; + +//https://stackoverflow.com/a/27885283/1884837 +template +struct function_traits : function_traits { +}; + +// partial specialization for function type +template +struct function_traits { + using result_type = R; + using argument_types = std::tuple...>; +}; + +// partial specialization for function pointer +template +struct function_traits { + using result_type = R; + using argument_types = std::tuple...>; +}; + +// partial specialization for std::function +template +struct function_traits> { + using result_type = R; + using argument_types = std::tuple...>; +}; + +// partial specialization for pointer-to-member-function (i.e., operator()'s) +template +struct function_traits { + using result_type = R; + using argument_types = std::tuple...>; +}; + +template +struct function_traits { + using result_type = R; + using argument_types = std::tuple...>; +}; + + +//https://stackoverflow.com/questions/281818/unmangling-the-result-of-stdtype-infoname +CV_EXPORTS std::string demangle(const char* name); + +template +struct fun_ptr_helper +{ +public: + typedef std::function<_Res(_ArgTypes...)> function_type; + + static void bind(function_type&& f) + { instance().fn_.swap(f); } + + static void bind(const function_type& f) + { instance().fn_=f; } + + static _Res invoke(_ArgTypes... 
args) + { return instance().fn_(args...); } + + typedef decltype(&fun_ptr_helper::invoke) pointer_type; + static pointer_type ptr() + { return &invoke; } + +private: + static fun_ptr_helper& instance() + { + static fun_ptr_helper inst_; + return inst_; + } + + fun_ptr_helper() {} + + function_type fn_; +}; + +template +typename fun_ptr_helper<_UniqueId, _Res, _ArgTypes...>::pointer_type +get_fn_ptr(const std::function<_Res(_ArgTypes...)>& f) +{ + fun_ptr_helper<_UniqueId, _Res, _ArgTypes...>::bind(f); + return fun_ptr_helper<_UniqueId, _Res, _ArgTypes...>::ptr(); +} + +template +std::function::value, T>::type> +make_function(T *t) +{ + return {t}; +} + +//https://stackoverflow.com/a/33047781/1884837 +struct Lambda { + template + static Tret lambda_ptr_exec() { + return (Tret) (*(T*)fn()); + } + + template + static Tfp ptr(T& t) { + fn(&t); + return (Tfp) lambda_ptr_exec; + } + + template + static const void* fn(const void* new_fn = nullptr) { + CV_Assert(new_fn); + return new_fn; + } +}; + +CV_EXPORTS size_t cnz(const cv::UMat& m); +} +using std::string; +class V4D; + + + +CV_EXPORTS void copy_shared(const cv::UMat& src, cv::UMat& dst); + +/*! + * Convenience function to color convert from Scalar to Scalar + * @param src The scalar to color convert + * @param code The color converions code + * @return The color converted scalar + */ +CV_EXPORTS cv::Scalar colorConvert(const cv::Scalar& src, cv::ColorConversionCodes code); + +/*! + * Convenience function to check for OpenGL errors. Should only be used via the macro #GL_CHECK. + * @param file The file path of the error. + * @param line The file line of the error. + * @param expression The expression that failed. + */ +CV_EXPORTS void gl_check_error(const std::filesystem::path& file, unsigned int line, const char* expression); +/*! + * Convenience macro to check for OpenGL errors. + */ +#ifndef NDEBUG +#define GL_CHECK(expr) \ + expr; \ + cv::v4d::gl_check_error(__FILE__, __LINE__, #expr); +#else +#define GL_CHECK(expr) \ + expr; +#endif +CV_EXPORTS void initShader(unsigned int handles[3], const char* vShader, const char* fShader, const char* outputAttributeName); + +/*! + * Returns the OpenGL vendor string + * @return a string object with the OpenGL vendor information + */ +CV_EXPORTS std::string getGlVendor(); +/*! + * Returns the OpenGL Version information. + * @return a string object with the OpenGL version information + */ +CV_EXPORTS std::string getGlInfo(); +/*! + * Returns the OpenCL Version information. + * @return a string object with the OpenCL version information + */ +CV_EXPORTS std::string getClInfo(); +/*! + * Determines if Intel VAAPI is supported + * @return true if it is supported + */ +CV_EXPORTS bool isIntelVaSupported(); +/*! + * Determines if cl_khr_gl_sharing is supported + * @return true if it is supported + */ +CV_EXPORTS bool isClGlSharingSupported(); +/*! + * Tells the application if it's alright to keep on running. + * Note: If you use this mechanism signal handlers are installed + * @return true if the program should keep on running + */ +CV_EXPORTS bool keepRunning(); + +CV_EXPORTS void requestFinish(); + +/*! + * Creates an Intel VAAPI enabled VideoWriter sink object to use in conjunction with #V4D::setSink(). + * Usually you would call #makeWriterSink() and let it automatically decide if VAAPI is available. + * @param outputFilename The filename to write the video to. + * @param fourcc The fourcc code of the codec to use. + * @param fps The fps of the target video. 
+ * @param frameSize The frame size of the target video. + * @param vaDeviceIndex The VAAPI device index to use. + * @return A VAAPI enabled sink object. + */ +CV_EXPORTS cv::Ptr makeVaSink(cv::Ptr window, const string& outputFilename, const int fourcc, const float fps, + const cv::Size& frameSize, const int vaDeviceIndex); +/*! + * Creates an Intel VAAPI enabled VideoCapture source object to use in conjunction with #V4D::setSource(). + * Usually you would call #makeCaptureSource() and let it automatically decide if VAAPI is available. + * @param inputFilename The file to read from. + * @param vaDeviceIndex The VAAPI device index to use. + * @return A VAAPI enabled source object. + */ +CV_EXPORTS cv::Ptr makeVaSource(cv::Ptr window, const string& inputFilename, const int vaDeviceIndex); +/*! + * Creates a VideoWriter sink object to use in conjunction with #V4D::setSink(). + * This function automatically determines if Intel VAAPI is available and enables it if so. + * @param outputFilename The filename to write the video to. + * @param fourcc The fourcc code of the codec to use. + * @param fps The fps of the target video. + * @param frameSize The frame size of the target video. + * @return A (optionally VAAPI enabled) VideoWriter sink object. + */ +CV_EXPORTS cv::Ptr makeWriterSink(cv::Ptr window, const string& outputFilename, const float fps, + const cv::Size& frameSize); +CV_EXPORTS cv::Ptr makeWriterSink(cv::Ptr window, const string& outputFilename, const float fps, + const cv::Size& frameSize, const int fourcc); +/*! + * Creates a VideoCapture source object to use in conjunction with #V4D::setSource(). + * This function automatically determines if Intel VAAPI is available and enables it if so. + * @param inputFilename The file to read from. + * @return A (optionally VAAPI enabled) VideoCapture enabled source object. + */ +CV_EXPORTS cv::Ptr makeCaptureSource(cv::Ptr window, const string& inputFilename); + +void resizePreserveAspectRatio(const cv::UMat& src, cv::UMat& output, const cv::Size& dstSize, const cv::Scalar& bgcolor = {0,0,0,255}); + +} +} + +#endif /* SRC_OPENCV_V4D_UTIL_HPP_ */ diff --git a/modules/v4d/include/opencv2/v4d/v4d.hpp b/modules/v4d/include/opencv2/v4d/v4d.hpp new file mode 100644 index 000000000..6f8d91132 --- /dev/null +++ b/modules/v4d/include/opencv2/v4d/v4d.hpp @@ -0,0 +1,872 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#ifndef SRC_OPENCV_V4D_V4D_HPP_ +#define SRC_OPENCV_V4D_V4D_HPP_ + +#include "source.hpp" +#include "sink.hpp" +#include "util.hpp" +#include "nvg.hpp" +#include "threadsafemap.hpp" +#include "detail/transaction.hpp" +#include "detail/framebuffercontext.hpp" +#include "detail/nanovgcontext.hpp" +#include "detail/imguicontext.hpp" +#include "detail/timetracker.hpp" +#include "detail/glcontext.hpp" +#include "detail/sourcecontext.hpp" +#include "detail/sinkcontext.hpp" +#include "detail/resequence.hpp" +#include "events.hpp" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + + + +using std::cout; +using std::cerr; +using std::endl; +using std::string; +using namespace std::chrono_literals; + + +/*! + * OpenCV namespace + */ +namespace cv { +/*! 
+ * V4D namespace + */ +namespace v4d { + +enum AllocateFlags { + NONE = 0, + NANOVG = 1, + IMGUI = 2, + ALL = NANOVG | IMGUI +}; + +class Plan { + const cv::Size sz_; + const cv::Rect vp_; +public: + + //predefined branch predicates + constexpr static auto always_ = []() { return true; }; + constexpr static auto isTrue_ = [](const bool& b) { return b; }; + constexpr static auto isFalse_ = [](const bool& b) { return !b; }; + constexpr static auto and_ = [](const bool& a, const bool& b) { return a && b; }; + constexpr static auto or_ = [](const bool& a, const bool& b) { return a || b; }; + + explicit Plan(const cv::Rect& vp) : sz_(cv::Size(vp.width, vp.height)), vp_(vp){}; + explicit Plan(const cv::Size& sz) : sz_(sz), vp_(0, 0, sz.width, sz.height){}; + virtual ~Plan() {}; + + virtual void gui(cv::Ptr window) { CV_UNUSED(window); }; + virtual void setup(cv::Ptr window) { CV_UNUSED(window); }; + virtual void infer(cv::Ptr window) = 0; + virtual void teardown(cv::Ptr window) { CV_UNUSED(window); }; + + const cv::Size& size() { + return sz_; + } + const cv::Rect& viewport() { + return vp_; + } +}; +/*! + * Private namespace + */ +namespace detail { + +template using static_not = std::integral_constant; + +template +struct is_function +{ + static const bool value = std::is_constructible>::value; +}; + +//https://stackoverflow.com/a/34873353/1884837 +template +struct is_stateless_lambda : std::integral_constant{}; + +template std::string int_to_hex( T i ) +{ + std::stringstream stream; + stream << "0x" + << std::setfill ('0') << std::setw(sizeof(T) * 2) + << std::hex << i; + return stream.str(); +} + +template std::string lambda_ptr_hex(Tlamba&& l) { + return int_to_hex((size_t)Lambda::ptr(l)); +} + +static std::size_t index(const std::thread::id id) +{ + static std::size_t nextindex = 0; + static std::mutex my_mutex; + static std::unordered_map ids; + std::lock_guard lock(my_mutex); + auto iter = ids.find(id); + if(iter == ids.end()) + return ids[id] = nextindex++; + return iter->second; +} + +template +const string make_id(const string& name, Tfn&& fn, Args&& ... args) { + stringstream ss; + ss << name << "(" << index(std::this_thread::get_id()) << "-" << detail::lambda_ptr_hex(std::forward(fn)) << ")"; + ((ss << ',' << int_to_hex((long)&args)), ...); + return ss.str(); +} + +} + + +using namespace cv::v4d::detail; + +class CV_EXPORTS V4D { + friend class detail::FrameBufferContext; + friend class HTML5Capture; + int32_t workerIdx_ = -1; + cv::Ptr self_; + cv::Ptr plan_; + const cv::Size initialSize_; + AllocateFlags flags_; + bool debug_; + cv::Rect viewport_; + bool stretching_; + int samples_; + bool focused_ = false; + cv::Ptr mainFbContext_ = nullptr; + cv::Ptr sourceContext_ = nullptr; + cv::Ptr sinkContext_ = nullptr; + cv::Ptr nvgContext_ = nullptr; + cv::Ptr imguiContext_ = nullptr; + cv::Ptr onceContext_ = nullptr; + cv::Ptr plainContext_ = nullptr; + std::mutex glCtxMtx_; + std::map> glContexts_; + bool closed_ = false; + cv::Ptr source_; + cv::Ptr sink_; + cv::UMat captureFrame_; + cv::UMat writerFrame_; + std::function keyEventCb_; + std::function mouseEventCb_; + cv::Point2f mousePos_; + uint64_t frameCnt_ = 0; + bool showFPS_ = true; + bool printFPS_ = false; + bool showTracking_ = true; + std::vector> accesses_; + std::map> transactions_; + bool disableIO_ = false; +public: + /*! + * Creates a V4D object which is the central object to perform visualizations with. + * @param initialSize The initial size of the heavy-weight window. 
+ * @param frameBufferSize The initial size of the framebuffer backing the window (needs to be equal or greate then initial size). + * @param offscreen Don't create a window and rather render offscreen. + * @param title The window title. + * @param major The OpenGL major version to request. + * @param minor The OpenGL minor version to request. + * @param compat Request a compatibility context. + * @param samples MSAA samples. + * @param debug Create a debug OpenGL context. + */ + CV_EXPORTS static cv::Ptr make(const cv::Size& size, const string& title, AllocateFlags flags = ALL, bool offscreen = false, bool debug = false, int samples = 0); + CV_EXPORTS static cv::Ptr make(const cv::Size& size, const cv::Size& fbsize, const string& title, AllocateFlags flags = ALL, bool offscreen = false, bool debug = false, int samples = 0); + CV_EXPORTS static cv::Ptr make(const V4D& v4d, const string& title); + /*! + * Default destructor + */ + CV_EXPORTS virtual ~V4D(); + + CV_EXPORTS const int32_t& workerIndex() const; + CV_EXPORTS size_t workers_running(); + /*! + * The internal framebuffer exposed as OpenGL Texture2D. + * @return The texture object. + */ + CV_EXPORTS cv::ogl::Texture2D& texture(); + CV_EXPORTS std::string title() const; + + struct Node { + string name_; + std::set read_deps_; + std::set write_deps_; + cv::Ptr tx_ = nullptr; + bool initialized() { + return tx_; + } + }; + + std::vector> nodes_; + + void findNode(const string& name, cv::Ptr& found) { + CV_Assert(!name.empty()); + if(nodes_.empty()) + return; + + if(nodes_.back()->name_ == name) + found = nodes_.back(); + + } + + void makeGraph() { +// cout << std::this_thread::get_id() << " ### MAKE PLAN ### " << endl; + for(const auto& t : accesses_) { + const string& name = std::get<0>(t); + const bool& read = std::get<1>(t); + const long& dep = std::get<2>(t); + cv::Ptr n; + findNode(name, n); + + if(!n) { + n = new Node(); + n->name_ = name; + n->tx_ = transactions_[name]; + CV_Assert(!n->name_.empty()); + CV_Assert(n->tx_); + nodes_.push_back(n); +// cout << "make: " << std::this_thread::get_id() << " " << n->name_ << endl; + } + + + if(read) { + n->read_deps_.insert(dep); + } else { + n->write_deps_.insert(dep); + } + } + } + + void runGraph() { + bool isEnabled = true; + + for (auto& n : nodes_) { + if (n->tx_->isPredicate()) { + isEnabled = n->tx_->enabled(); + } else if (isEnabled) { + if(n->tx_->lock()) { + std::lock_guard guard(Global::mutex()); + n->tx_->getContext()->execute([n]() { + TimeTracker::getInstance()->execute(n->name_, [n](){ + n->tx_->perform(); + }); + }); + } else { + n->tx_->getContext()->execute([n]() { + TimeTracker::getInstance()->execute(n->name_, [n](){ + n->tx_->perform(); + }); + }); + } + } + } + } + + void clearGraph() { + nodes_.clear(); + accesses_.clear(); + } + + template + typename std::enable_if::value, void>::type + emit_access(const string& context, bool read, const T* tp) { + //disabled + } + + template + typename std::enable_if::value, void>::type + emit_access(const string& context, bool read, const T* tp) { +// cout << "access: " << std::this_thread::get_id() << " " << context << string(read ? 
" <- " : " -> ") << demangle(typeid(std::remove_const_t).name()) << "(" << (long)tp << ") " << endl; + accesses_.push_back(std::make_tuple(context, read, (long)tp)); + } + + template + void add_transaction(bool lock, cv::Ptr ctx, const string& invocation, Tfn fn, Args&& ...args) { + auto it = transactions_.find(invocation); + if(it == transactions_.end()) { + auto tx = make_transaction(lock, fn, std::forward(args)...); + tx->setContext(ctx); + transactions_.insert({invocation, tx}); + } + } + + template + void init_context_call(Tfn fn, Args&& ... args) { + static_assert(detail::is_stateless_lambda>>::value, "All passed functors must be stateless lambdas"); + static_assert(std::conjunction...>::value, "All arguments must be l-value references"); + cv::v4d::event::set_current_glfw_window(getGLFWWindow()); + } + + + template + typename std::enable_if, void>::type + gl(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + const string id = make_id("gl-1", fn, args...); + emit_access(id, true, &fbCtx()->fb()); + (emit_access, Args...>(id, std::is_const_v>, &args),...); + emit_access(id, false, &fbCtx()->fb()); + std::function functor(fn); + add_transaction(false, glCtx(-1), id, std::forward(fn), std::forward(args)...); + } + + template + void gl(int32_t idx, Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + const string id = make_id("gl" + std::to_string(idx), fn, args...); + emit_access(id, true, &fbCtx()->fb()); + (emit_access, Args...>(id, std::is_const_v>, &args),...); + emit_access(id, false, &fbCtx()->fb()); + std::function functor(fn); + add_transaction(false, glCtx(idx),id, std::forward(functor), glCtx(idx)->getIndex(), std::forward(args)...); + } + + template + void branch(Tfn fn) { + init_context_call(fn); + const string id = make_id("branch", fn); + std::function functor = fn; + emit_access(id, true, &fn); + add_transaction(true, plainCtx(), id, functor); + } + + template + void branch(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + const string id = make_id("branch", fn, args...); + + (emit_access, Args...>(id, std::is_const_v>, &args),...); + std::function functor = fn; + add_transaction(true, plainCtx(), id, functor, std::forward(args)...); + } + + template + void branch(int workerIdx, Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + const string id = make_id("branch-pin" + std::to_string(workerIdx), fn, args...); + + (emit_access, Args...>(id, std::is_const_v>, &args),...); + std::function functor = fn; + std::function wrap = [this, workerIdx, functor](Args&& ... args){ + return this->workerIndex() == workerIdx && functor(args...); + }; + add_transaction(true, plainCtx(), id, wrap, std::forward(args)...); + } + + template + void endbranch(Tfn fn) { + init_context_call(fn); + const string id = make_id("endbranch", fn); + + std::function functor = fn; + emit_access(id, true, &fn); + add_transaction(true, plainCtx(), id, functor); + } + + template + void endbranch(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + const string id = make_id("endbranch", fn, args...); + + (emit_access, Args...>(id, std::is_const_v>, &args),...); + std::function functor = [this](Args&& ... args){ + return true; + }; + add_transaction(true, plainCtx(), id, functor, std::forward(args)...); + } + + template + void endbranch(int workerIdx, Tfn fn, Args&& ... 
args) { + init_context_call(fn, args...); + + const string id = make_id("endbranch-pin" + std::to_string(workerIdx), fn, args...); + + (emit_access, Args...>(id, std::is_const_v>, &args),...); + std::function functor = [this, workerIdx](Args&& ... args){ + return this->workerIndex() == workerIdx; + }; + add_transaction(true, plainCtx(), id, functor, std::forward(args)...); + } + + template + void fb(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + const string id = make_id("fb", fn, args...); + using Tfb = std::add_lvalue_reference_t::argument_types>::type>; + using Tfbbase = typename std::remove_cv::type; + + static_assert((std::is_same::value || std::is_same::value) || !"The first argument must be eiter of type 'cv::UMat&' or 'const cv::UMat&'"); + emit_access(id, true, &fbCtx()->fb()); + (emit_access, Tfb, Args...>(id, std::is_const_v>, &args),...); + emit_access::type>, cv::UMat, Tfb, Args...>(id, false, &fbCtx()->fb()); + std::function functor(fn); + add_transaction(false, fbCtx(),id, std::forward(functor), fbCtx()->fb(), std::forward(args)...); + } + + void capture() { + if(disableIO_) + return; + capture([](const cv::UMat& inputFrame, cv::UMat& f){ + if(!inputFrame.empty()) + inputFrame.copyTo(f); + }, captureFrame_); + + fb([](cv::UMat& frameBuffer, const cv::UMat& f) { + if(!f.empty()) + f.copyTo(frameBuffer); + }, captureFrame_); + } + + template + void capture(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + + if(disableIO_) + return; + const string id = make_id("capture", fn, args...); + using Tfb = std::add_lvalue_reference_t::argument_types>::type>; + + static_assert((std::is_same::value) || !"The first argument must be of type 'const cv::UMat&'"); + emit_access(id, true, &sourceCtx()->sourceBuffer()); + (emit_access, Tfb, Args...>(id, std::is_const_v>, &args),...); + std::function functor(fn); + add_transaction(false, std::dynamic_pointer_cast(sourceCtx()),id, std::forward(functor), sourceCtx()->sourceBuffer(), std::forward(args)...); + } + + void write() { + if(disableIO_) + return; + + fb([](const cv::UMat& frameBuffer, cv::UMat& f) { + frameBuffer.copyTo(f); + }, writerFrame_); + + write([](cv::UMat& outputFrame, const cv::UMat& f){ + f.copyTo(outputFrame); + }, writerFrame_); + } + + template + void write(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + + if(disableIO_) + return; + const string id = make_id("write", fn, args...); + using Tfb = std::add_lvalue_reference_t::argument_types>::type>; + + static_assert((std::is_same::value) || !"The first argument must be of type 'cv::UMat&'"); + emit_access(id, true, &sinkCtx()->sinkBuffer()); + (emit_access, Tfb, Args...>(id, std::is_const_v>, &args),...); + emit_access(id, false, &sinkCtx()->sinkBuffer()); + std::function functor(fn); + add_transaction(false, std::dynamic_pointer_cast(sinkCtx()),id, std::forward(functor), sinkCtx()->sinkBuffer(), std::forward(args)...); + } + + template + void nvg(Tfn fn, Args&&... args) { + init_context_call(fn, args...); + + const string id = make_id("nvg", fn, args...); + emit_access(id, true, &fbCtx()->fb()); + (emit_access, Args...>(id, std::is_const_v>, &args),...); + emit_access(id, false, &fbCtx()->fb()); + std::function functor(fn); + add_transaction(false, nvgCtx(), id, std::forward(fn), std::forward(args)...); + } + + template + void once(Tfn fn, Args&&... 
args) { + CV_Assert(detail::is_stateless_lambda>>::value); + const string id = make_id("once", fn, args...); + (emit_access, Args...>(id, std::is_const_v>, &args),...); + std::function functor(fn); + add_transaction(false, onceCtx(), id, std::forward(fn), std::forward(args)...); + } + + template + void plain(Tfn fn, Args&&... args) { + init_context_call(fn, args...); + + const string id = make_id("plain", fn, args...); + (emit_access, Args...>(id, std::is_const_v>, &args),...); + std::function functor(fn); + add_transaction(false, fbCtx(), id, std::forward(fn), std::forward(args)...); + } + + template + void imgui(Tfn fn, Args&& ... args) { + init_context_call(fn, args...); + + if(!hasImguiCtx()) + return; + + auto s = self(); + + imguiCtx()->build([s, fn, &args...](ImGuiContext* ctx) { + fn(s, ctx, args...); + }); + } + /*! + * Copy the framebuffer contents to an OutputArray. + * @param arr The array to copy to. + */ + CV_EXPORTS void copyTo(cv::UMat& arr); + /*! + * Copy the InputArray contents to the framebuffer. + * @param arr The array to copy. + */ + CV_EXPORTS void copyFrom(const cv::UMat& arr); + + template + void run(cv::Ptr plan, int32_t workers = -1) { + plan_ = std::static_pointer_cast(plan); + + static Resequence reseq; + //for now, if automatic determination of the number of workers is requested, + //set workers always to 2 + CV_Assert(workers > -2); + if(workers == -1) { + workers = 2; + } else { + ++workers; + } + + std::vector threads; + { + static std::mutex runMtx; + std::unique_lock lock(runMtx); + + cerr << "run plan: " << std::this_thread::get_id() << " workers: " << workers << endl; + + if(Global::is_first_run()) { + Global::set_main_id(std::this_thread::get_id()); + cerr << "Starting with " << workers - 1<< " extra workers" << endl; + cv::utils::logging::setLogLevel(cv::utils::logging::LOG_LEVEL_SILENT); + } + + if(workers > 1) { + cv::setNumThreads(0); + } + + if(Global::is_main()) { + cv::Size sz = this->initialSize(); + const string title = this->title(); + bool debug = this->debug_; + auto src = this->getSource(); + auto sink = this->getSink(); + Global::set_workers_started(workers); + std::vector> plans; + //make sure all Plans are constructed before starting the workers + for (size_t i = 0; i < workers; ++i) { + plans.push_back(new Tplan(plan->size())); + } + for (size_t i = 0; i < workers; ++i) { + threads.push_back( + new std::thread( + [this, i, src, sink, plans] { + cv::utils::logging::setLogLevel(cv::utils::logging::LOG_LEVEL_SILENT); + cv::Ptr worker = V4D::make(*this, this->title() + "-worker-" + std::to_string(i)); + if (src) { + worker->setSource(src); + } + if (sink) { + worker->setSink(sink); + } + cv::Ptr newPlan = plans[i]; + worker->run(newPlan, 0); + } + ) + ); + } + } + } + + CLExecScope_t scope(this->fbCtx()->getCLExecContext()); + this->fbCtx()->makeCurrent(); + + if(Global::is_main()) { + this->printSystemInfo(); + } else { + try { + plan->setup(self()); + this->makeGraph(); + this->runGraph(); + this->clearGraph(); + if(!Global::is_main() && Global::workers_started() == Global::next_worker_ready()) { + cv::utils::logging::setLogLevel(cv::utils::logging::LOG_LEVEL_INFO); + } + } catch(std::exception& ex) { + CV_Error_(cv::Error::StsError, ("pipeline setup failed: %s", ex.what())); + } + } + if(Global::is_main()) { + try { + plan->gui(self()); + } catch(std::exception& ex) { + CV_Error_(cv::Error::StsError, ("GUI setup failed: %s", ex.what())); + } + } else { + plan->infer(self()); + this->makeGraph(); + } + + try { + if(Global::is_main()) { 
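+                //Only the main instance drives the display/event loop below; the worker
+                //instances created above run the inferred graph and are resequenced per frame.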
+ do { + //refresh-rate depends on swap interval (1) for sync + } while(keepRunning() && this->display()); + requestFinish(); + reseq.finish(); + } else { + cerr << "Starting pipeling with " << this->nodes_.size() << " nodes." << endl; + + static std::mutex seqMtx; + do { + reseq.notify(); + uint64_t seq; + { + std::unique_lock lock(seqMtx); + seq = Global::next_run_cnt(); + } + + this->runGraph(); + reseq.waitFor(seq); + } while(keepRunning() && this->display()); + } + } catch(std::exception& ex) { + requestFinish(); + reseq.finish(); + CV_LOG_WARNING(nullptr, "-> pipeline terminated: " << ex.what()); + } + + if(!Global::is_main()) { + this->clearGraph(); + + try { + plan->teardown(self()); + this->makeGraph(); + this->runGraph(); + this->clearGraph(); + } catch(std::exception& ex) { + CV_Error_(cv::Error::StsError, ("pipeline tear-down failed: %s", ex.what())); + } + } else { + for(auto& t : threads) + t->join(); + } + } +/*! + * Called to feed an image directly to the framebuffer + */ + void feed(cv::UMat& in); + /*! + * Fetches a copy of frambuffer + * @return a copy of the framebuffer + */ + CV_EXPORTS cv::UMat fetch(); + + /*! + * Set the current #cv::viz::Source object. Usually created using #makeCaptureSource(). + * @param src A #cv::viz::Source object. + */ + CV_EXPORTS void setSource(cv::Ptr src); + CV_EXPORTS cv::Ptr getSource(); + CV_EXPORTS bool hasSource(); + + /*! + * Set the current #cv::viz::Sink object. Usually created using #makeWriterSink(). + * @param sink A #cv::viz::Sink object. + */ + CV_EXPORTS void setSink(cv::Ptr sink); + CV_EXPORTS cv::Ptr getSink(); + CV_EXPORTS bool hasSink(); + /*! + * Get the window position. + * @return The window position. + */ + CV_EXPORTS cv::Vec2f position(); + /*! + * Get the current viewport reference. + * @return The current viewport reference. + */ + CV_EXPORTS cv::Rect& viewport(); + /*! + * Get the pixel ratio of the display x-axis. + * @return The pixel ratio of the display x-axis. + */ + CV_EXPORTS float pixelRatioX(); + /*! + * Get the pixel ratio of the display y-axis. + * @return The pixel ratio of the display y-axis. + */ + CV_EXPORTS float pixelRatioY(); + CV_EXPORTS const cv::Size& initialSize() const; + CV_EXPORTS const cv::Size& fbSize() const; + /*! + * Set the window size + * @param sz The future size of the window. + */ + CV_EXPORTS void setSize(const cv::Size& sz); + /*! + * Get the window size. + * @return The window size. + */ + CV_EXPORTS cv::Size size(); + /*! + * Get the frambuffer size. + * @return The framebuffer size. + */ + + CV_EXPORTS bool getShowFPS(); + CV_EXPORTS void setShowFPS(bool s); + CV_EXPORTS bool getPrintFPS(); + CV_EXPORTS void setPrintFPS(bool p); + CV_EXPORTS bool getShowTracking(); + CV_EXPORTS void setShowTracking(bool st); + CV_EXPORTS void setDisableIO(bool d); + + CV_EXPORTS bool isFullscreen(); + /*! + * Enable or disable fullscreen mode. + * @param f if true enable fullscreen mode else disable. + */ + CV_EXPORTS void setFullscreen(bool f); + /*! + * Determines if the window is resizeable. + * @return true if the window is resizeable. + */ + CV_EXPORTS bool isResizable(); + /*! + * Set the window resizable. + * @param r if r is true set the window resizable. + */ + CV_EXPORTS void setResizable(bool r); + /*! + * Determine if the window is visible. + * @return true if the window is visible. + */ + CV_EXPORTS bool isVisible(); + /*! + * Set the window visible or invisible. + * @param v if v is true set the window visible. + */ + CV_EXPORTS void setVisible(bool v); + /*! 
+ * Enable/Disable scaling the framebuffer during blitting. + * @param s if true enable scaling. + */ + CV_EXPORTS void setStretching(bool s); + /*! + * Determine if framebuffer is scaled during blitting. + * @return true if framebuffer is scaled during blitting. + */ + CV_EXPORTS bool isStretching(); + /*! + * Determine if th V4D object is marked as focused. + * @return true if the V4D object is marked as focused. + */ + CV_EXPORTS bool isFocused(); + /*! + * Mark the V4D object as focused. + * @param s if true mark as focused. + */ + CV_EXPORTS void setFocused(bool f); + /*! + * Everytime a frame is displayed this count is incremented- + * @return the current frame count- + */ + CV_EXPORTS const uint64_t& frameCount() const; + /*! + * Determine if the window is closed. + * @return true if the window is closed. + */ + CV_EXPORTS bool isClosed(); + /*! + * Close the window. + */ + CV_EXPORTS void close(); + /*! + * Display the framebuffer in the native window by blitting. + * @return false if the window is closed. + */ + CV_EXPORTS bool display(); + /*! + * Print basic system information to stderr. + */ + CV_EXPORTS void printSystemInfo(); + + CV_EXPORTS GLFWwindow* getGLFWWindow() const; + + CV_EXPORTS cv::Ptr fbCtx() const; + CV_EXPORTS cv::Ptr sourceCtx(); + CV_EXPORTS cv::Ptr sinkCtx(); + CV_EXPORTS cv::Ptr nvgCtx(); + CV_EXPORTS cv::Ptr onceCtx(); + CV_EXPORTS cv::Ptr plainCtx(); + CV_EXPORTS cv::Ptr imguiCtx(); + CV_EXPORTS cv::Ptr glCtx(int32_t idx = 0); + + CV_EXPORTS bool hasFbCtx(); + CV_EXPORTS bool hasSourceCtx(); + CV_EXPORTS bool hasSinkCtx(); + CV_EXPORTS bool hasNvgCtx(); + CV_EXPORTS bool hasOnceCtx(); + CV_EXPORTS bool hasParallelCtx(); + CV_EXPORTS bool hasImguiCtx(); + CV_EXPORTS bool hasGlCtx(uint32_t idx = 0); + CV_EXPORTS size_t numGlCtx(); +private: + V4D(const V4D& v4d, const string& title); + V4D(const cv::Size& size, const cv::Size& fbsize, + const string& title, AllocateFlags flags, bool offscreen, bool debug, int samples); + + cv::Point2f getMousePosition(); + void setMousePosition(const cv::Point2f& pt); + + void swapContextBuffers(); +protected: + AllocateFlags flags(); + cv::Ptr self(); + void fence(); + bool wait(uint64_t timeout = 0); +}; +} +} /* namespace cv */ + +#endif /* SRC_OPENCV_V4D_V4D_HPP_ */ + diff --git a/modules/v4d/samples/beauty-demo.cpp b/modules/v4d/samples/beauty-demo.cpp new file mode 100644 index 000000000..1b0787d52 --- /dev/null +++ b/modules/v4d/samples/beauty-demo.cpp @@ -0,0 +1,399 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) +#include +#include +#include +#include +#include +#include + +#include +#include + +using std::vector; +using std::string; + +/*! + * Data structure holding the points for all face landmarks + */ +struct FaceFeatures { + cv::Rect faceRect_; + vector chin_; + vector top_nose_; + vector bottom_nose_; + vector left_eyebrow_; + vector right_eyebrow_; + vector left_eye_; + vector right_eye_; + vector outer_lips_; + vector inside_lips_; + FaceFeatures() {}; + FaceFeatures(const cv::Rect &faceRect, const vector &shape, double local_scale) { + //calculate the face rectangle + faceRect_ = cv::Rect(faceRect.x / local_scale, faceRect.y / local_scale, faceRect.width / local_scale, faceRect.height / local_scale); + + /** Copy all features **/ + size_t i = 0; + // Around Chin. 
Ear to Ear + for (i = 0; i <= 16; ++i) + chin_.push_back(shape[i] / local_scale); + // left eyebrow + for (; i <= 21; ++i) + left_eyebrow_.push_back(shape[i] / local_scale); + // Right eyebrow + for (; i <= 26; ++i) + right_eyebrow_.push_back(shape[i] / local_scale); + // Line on top of nose + for (; i <= 30; ++i) + top_nose_.push_back(shape[i] / local_scale); + // Bottom part of the nose + for (; i <= 35; ++i) + bottom_nose_.push_back(shape[i] / local_scale); + // Left eye + for (; i <= 41; ++i) + left_eye_.push_back(shape[i] / local_scale); + // Right eye + for (; i <= 47; ++i) + right_eye_.push_back(shape[i] / local_scale); + // Lips outer part + for (; i <= 59; ++i) + outer_lips_.push_back(shape[i] / local_scale); + // Lips inside part + for (; i <= 67; ++i) + inside_lips_.push_back(shape[i] / local_scale); + } + + //Concatenates all feature points + vector points() const { + vector allPoints; + allPoints.insert(allPoints.begin(), chin_.begin(), chin_.end()); + allPoints.insert(allPoints.begin(), top_nose_.begin(), top_nose_.end()); + allPoints.insert(allPoints.begin(), bottom_nose_.begin(), bottom_nose_.end()); + allPoints.insert(allPoints.begin(), left_eyebrow_.begin(), left_eyebrow_.end()); + allPoints.insert(allPoints.begin(), right_eyebrow_.begin(), right_eyebrow_.end()); + allPoints.insert(allPoints.begin(), left_eye_.begin(), left_eye_.end()); + allPoints.insert(allPoints.begin(), right_eye_.begin(), right_eye_.end()); + allPoints.insert(allPoints.begin(), outer_lips_.begin(), outer_lips_.end()); + allPoints.insert(allPoints.begin(), inside_lips_.begin(), inside_lips_.end()); + + return allPoints; + } + + //Returns all feature points in fixed order + vector> features() const { + return {chin_, + top_nose_, + bottom_nose_, + left_eyebrow_, + right_eyebrow_, + left_eye_, + right_eye_, + outer_lips_, + inside_lips_}; + } + + size_t empty() const { + return points().empty(); + } +}; + +using namespace cv::v4d; + +class BeautyDemoPlan : public Plan { +public: + using Plan::Plan; +private: + cv::Size downSize_; + + static struct Params { + int blurSkinKernelSize_ = 0; + //Saturation boost factor for eyes and lips + float eyesAndLipsSaturation_ = 1.8f; + //Saturation boost factor for skin + float skinSaturation_ = 1.4f; + //Contrast factor skin + float skinContrast_ = 0.7f; + //Show input and output side by side + bool sideBySide_ = false; + //Scale the video to the window size + bool stretch_ = true; + } params_; + + struct Cache { + vector channels_; + cv::UMat hls_; + cv::UMat blur_; + cv::UMat frameOutFloat_; + cv::UMat bgra_; + } cache_; + + struct Frames { + //BGR + cv::UMat orig_, down_, contrast_, faceOval_, eyesAndLips_, skin_; + cv::UMat lhalf_; + cv::UMat rhalf_; + //GREY + cv::UMat faceSkinMaskGrey_, eyesAndLipsMaskGrey_, backgroundMaskGrey_; + } frames_; + + //results of face detection and facemark + struct Face { + vector> shapes_; + std::vector faceRects_; + bool found_ = false; + FaceFeatures features_; + } face_; + + //the frame holding the final composed image + cv::UMat frameOut_; + cv::Ptr facemark_ = cv::face::createFacemarkLBF(); + //Blender (used to put the different face parts back together) + cv::Ptr blender_ = new cv::detail::MultiBandBlender(true, 5); + //Face detector + cv::Ptr detector_; + + //based on the detected FaceFeatures it guesses a decent face oval and draws a mask for it. 
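+    //The mask is drawn with NanoVG as a white ellipse fitted to all landmark points on a
+    //cleared framebuffer and is later converted to a single-channel grey mask via fb().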
+ static void draw_face_oval_mask(const FaceFeatures &ff) { + using namespace cv::v4d::nvg; + clear(); + + cv::RotatedRect rotRect = cv::fitEllipse(ff.points()); + + beginPath(); + fillColor(cv::Scalar(255, 255, 255, 255)); + ellipse(rotRect.center.x, rotRect.center.y * 0.875, rotRect.size.width / 2, rotRect.size.height / 1.75); + rotate(rotRect.angle); + fill(); + } + + //Draws a mask consisting of eyes and lips areas (deduced from FaceFeatures) + static void draw_face_eyes_and_lips_mask(const FaceFeatures &ff) { + using namespace cv::v4d::nvg; + clear(); + vector> features = ff.features(); + for (size_t j = 5; j < 8; ++j) { + beginPath(); + fillColor(cv::Scalar(255, 255, 255, 255)); + moveTo(features[j][0].x, features[j][0].y); + for (size_t k = 1; k < features[j].size(); ++k) { + lineTo(features[j][k].x, features[j][k].y); + } + closePath(); + fill(); + } + + beginPath(); + fillColor(cv::Scalar(0, 0, 0, 255)); + moveTo(features[8][0].x, features[8][0].y); + for (size_t k = 1; k < features[8].size(); ++k) { + lineTo(features[8][k].x, features[8][k].y); + } + closePath(); + fill(); + } + + //adjusts the saturation of a UMat + static void adjust_saturation(const cv::UMat &srcBGR, cv::UMat &dstBGR, float factor, Cache& cache) { + cvtColor(srcBGR, cache.hls_, cv::COLOR_BGR2HLS); + split(cache.hls_, cache.channels_); + cv::multiply(cache.channels_[2], factor, cache.channels_[2]); + merge(cache.channels_, cache.hls_); + cvtColor(cache.hls_, dstBGR, cv::COLOR_HLS2BGR); + } +public: + + void gui(cv::Ptr window) override { + window->imgui([](cv::Ptr win, ImGuiContext* ctx, Params& params){ + using namespace ImGui; + SetCurrentContext(ctx); + Begin("Effect"); + Text("Display"); + Checkbox("Side by side", ¶ms.sideBySide_); + if(Checkbox("Stetch", ¶ms.stretch_)) { + win->setStretching(true); + } else + win->setStretching(false); + + if(Button("Fullscreen")) { + win->setFullscreen(!win->isFullscreen()); + }; + + if(Button("Offscreen")) { + win->setVisible(!win->isVisible()); + }; + + Text("Face Skin"); + SliderInt("Blur", ¶ms.blurSkinKernelSize_, 1, 128); + SliderFloat("Saturation", ¶ms.skinSaturation_, 0.0f, 100.0f); + SliderFloat("Contrast", ¶ms.skinContrast_, 0.0f, 1.0f); + Text("Eyes and Lips"); + SliderFloat("Saturation ", ¶ms.eyesAndLipsSaturation_, 0.0f, 100.0f); + End(); + }, params_); + } + void setup(cv::Ptr window) override { + int w = size().width; + int h = size().height; + downSize_ = { std::min(w, std::max(640, int(round(w / 2.0)))), std::min(h, std::max(360, int(round(h / 2.0)))) }; + detector_ = cv::FaceDetectorYN::create("modules/v4d/assets/models/face_detection_yunet_2023mar.onnx", "", downSize_, 0.9, 0.3, 5000, cv::dnn::DNN_BACKEND_OPENCV, cv::dnn::DNN_TARGET_OPENCL); + int diag = hypot(double(size().width), double(size().height)); + params_.blurSkinKernelSize_ = std::max(int(diag / 2000 % 2 == 0 ? 
diag / 2000 + 1 : diag / 2000), 1); + + window->setStretching(params_.stretch_); + window->plain([](cv::Ptr& facemark){ + facemark->loadModel("modules/v4d/assets/models/lbfmodel.yaml"); + }, facemark_); + } + void infer(cv::Ptr window) override { + try { + window->branch(always_); + { + window->capture(); + + //Save the video frame as BGR + window->fb([](const cv::UMat &framebuffer, const cv::Rect& viewport, const cv::Size& downSize, Frames& frames) { + cvtColor(framebuffer(viewport), frames.orig_, cv::COLOR_BGRA2BGR); + + //Downscale the video frame for face detection + cv::resize(frames.orig_, frames.down_, downSize); + }, viewport(), downSize_, frames_); + + window->plain([](const cv::Size sz, cv::Ptr& detector, cv::Ptr& facemark, const cv::UMat& down, Face& face) { + face.shapes_.clear(); + cv::Mat faces; + //Detect faces in the down-scaled image + detector->detect(down, faces); + //Only add the first face + cv::Rect faceRect; + if(!faces.empty()) + faceRect = cv::Rect(int(faces.at(0, 0)), int(faces.at(0, 1)), int(faces.at(0, 2)), int(faces.at(0, 3))); + face.faceRects_ = {faceRect}; + //find landmarks if faces have been detected + face.found_ = !faceRect.empty() && facemark->fit(down, face.faceRects_, face.shapes_); + if(face.found_) + face.features_ = FaceFeatures(face.faceRects_[0], face.shapes_[0], float(down.size().width) / sz.width); + }, size(), detector_, facemark_, frames_.down_, face_); + } + window->endbranch(always_); + + window->branch(isTrue_, face_.found_); + { + window->nvg([](const FaceFeatures& features) { + //Draw the face oval of the first face + draw_face_oval_mask(features); + }, face_.features_); + + window->fb([](const cv::UMat& framebuffer, const cv::Rect& viewport, cv::UMat& faceOval) { + //Convert/Copy the mask + cvtColor(framebuffer(viewport), faceOval, cv::COLOR_BGRA2GRAY); + }, viewport(), frames_.faceOval_); + + window->nvg([](const FaceFeatures& features) { + //Draw eyes eyes and lips areas of the first face + draw_face_eyes_and_lips_mask(features); + }, face_.features_); + + window->fb([](const cv::UMat &framebuffer, const cv::Rect& viewport, cv::UMat& eyesAndLipsMaskGrey) { + //Convert/Copy the mask + cvtColor(framebuffer(viewport), eyesAndLipsMaskGrey, cv::COLOR_BGRA2GRAY); + }, viewport(), frames_.eyesAndLipsMaskGrey_); + + window->plain([](Frames& frames, const Params& params, Cache& cache) { + //Create the skin mask + cv::subtract(frames.faceOval_, frames.eyesAndLipsMaskGrey_, frames.faceSkinMaskGrey_); + //Create the background mask + cv::bitwise_not(frames.faceOval_, frames.backgroundMaskGrey_); + //boost saturation of eyes and lips + adjust_saturation(frames.orig_, frames.eyesAndLips_, params.eyesAndLipsSaturation_, cache); + //reduce skin contrast + multiply(frames.orig_, cv::Scalar::all(params.skinContrast_), frames.contrast_); + //fix skin brightness + add(frames.contrast_, cv::Scalar::all((1.0 - params.skinContrast_) / 2.0) * 255.0, frames.contrast_); + //blur the skin_ + cv::boxFilter(frames.contrast_, cache.blur_, -1, cv::Size(params.blurSkinKernelSize_, params.blurSkinKernelSize_), cv::Point(-1, -1), true, cv::BORDER_REPLICATE); + //boost skin saturation + adjust_saturation(cache.blur_, frames.skin_, params.skinSaturation_, cache); + }, frames_, params_, cache_); + + window->plain([](cv::Ptr& bl, Frames& frames, cv::UMat& frameOut, Cache& cache) { + CV_Assert(!frames.skin_.empty()); + CV_Assert(!frames.eyesAndLips_.empty()); + //piece it all together + bl->prepare(cv::Rect(0, 0, frames.skin_.cols, frames.skin_.rows)); + 
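+                    //Feed each layer together with its grey mask: smoothed skin, untouched background
+                    //and saturation-boosted eyes/lips; the multi-band blender composites them seamlessly.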
bl->feed(frames.skin_, frames.faceSkinMaskGrey_, cv::Point(0, 0)); + bl->feed(frames.orig_, frames.backgroundMaskGrey_, cv::Point(0, 0)); + bl->feed(frames.eyesAndLips_, frames.eyesAndLipsMaskGrey_, cv::Point(0, 0)); + bl->blend(cache.frameOutFloat_, cv::UMat()); + CV_Assert(!cache.frameOutFloat_.empty()); + cache.frameOutFloat_.convertTo(frameOut, CV_8U, 1.0); + }, blender_, frames_, frameOut_, cache_); + + window->plain([](const cv::Size& sz, const cv::UMat& orig, cv::UMat& frameOut, cv::UMat lhalf, cv::UMat rhalf, const Params& params) { + if (params.sideBySide_) { + //create side-by-side view with a result + cv::resize(orig, lhalf, cv::Size(0, 0), 0.5, 0.5); + cv::resize(frameOut, rhalf, cv::Size(0, 0), 0.5, 0.5); + + frameOut = cv::Scalar::all(0); + lhalf.copyTo(frameOut(cv::Rect(0, sz.height / 2.0, lhalf.size().width, lhalf.size().height))); + rhalf.copyTo(frameOut(cv::Rect(sz.width / 2.0, sz.height / 2.0, lhalf.size().width, lhalf.size().height))); + } + }, size(), frames_.orig_, frameOut_, frames_.lhalf_, frames_.rhalf_, params_); + } + window->endbranch(isTrue_, face_.found_); + + window->branch(isFalse_, face_.found_); + { + window->plain([](const cv::Size& sz, const cv::UMat& orig, cv::UMat& frameOut, cv::UMat lhalf, const Params& params) { + if (params.sideBySide_) { + //create side-by-side view without a result (using the input image for both sides) + frameOut = cv::Scalar::all(0); + cv::resize(orig, lhalf, cv::Size(0, 0), 0.5, 0.5); + lhalf.copyTo(frameOut(cv::Rect(0, sz.height / 2.0, lhalf.size().width, lhalf.size().height))); + lhalf.copyTo(frameOut(cv::Rect(sz.width / 2.0, sz.height / 2.0, lhalf.size().width, lhalf.size().height))); + } else { + orig.copyTo(frameOut); + } + }, size(), frames_.orig_, frameOut_, frames_.lhalf_, params_); + } + window->endbranch(isFalse_, face_.found_); + + window->branch(always_); + { + //write the result to the framebuffer + window->fb([](cv::UMat& framebuffer, const cv::Rect& viewport, const cv::UMat& f, Cache& cache) { + cvtColor(f, cache.bgra_, cv::COLOR_BGR2BGRA); + cv::resize(cache.bgra_, framebuffer(viewport), viewport.size()); + }, viewport(), frameOut_, cache_); + + //write the current framebuffer to video + window->write(); + } + window->endbranch(always_); + + } catch (std::exception &ex) { + cerr << ex.what() << endl; + } + } +}; + +BeautyDemoPlan::Params BeautyDemoPlan::params_; + +int main(int argc, char **argv) { + if (argc != 2) { + cerr << "Usage: beauty-demo " << endl; + exit(1); + } + + cv::Ptr plan = new BeautyDemoPlan(cv::Size(1920, 1080)); + cv::Ptr window = V4D::make(plan->size(), "Beautification Demo", ALL); + auto src = makeCaptureSource(window, argv[1]); + auto sink = makeWriterSink(window, "beauty-demo.mkv", src->fps(), plan->size()); + window->setSource(src); + window->setSink(sink); + window->run(plan); + + return 0; +} diff --git a/modules/v4d/samples/bgfx-demo.cpp b/modules/v4d/samples/bgfx-demo.cpp new file mode 100644 index 000000000..c9c23fb46 --- /dev/null +++ b/modules/v4d/samples/bgfx-demo.cpp @@ -0,0 +1,75 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include + +using namespace cv::v4d; + + +class DisplayImageBgfx : public Plan { + Property vp_ = P(V4D::Keys::VIEWPORT); +public: + void setup() override { + bgfx([](const cv::Rect& vp) { + // Set view 0 clear state. 
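+            //(clears the color buffer to an opaque dark grey, 0x303030ff, and depth to 1.0)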
+ bgfx::setViewClear(0 + , BGFX_CLEAR_COLOR|BGFX_CLEAR_DEPTH + , 0x303030ff + , 1.0f + , 0 + ); + + // Set view 0 default viewport. + bgfx::setViewRect(0, vp.x, vp.y, uint16_t(vp.width), uint16_t(vp.height)); + }, vp_); + } + + void infer() override { + bgfx([](const cv::Rect& vp) { + + // This dummy draw call is here to make sure that view 0 is cleared + // if no other draw calls are submitted to view 0. + bgfx::touch(0); + + // Use debug font to print information about this example. + bgfx::dbgTextClear(); + + const bgfx::Stats* stats = bgfx::getStats(); + + bgfx::dbgTextPrintf( + bx::max(uint16_t(stats->textWidth/2), 20)-20 + , bx::max(uint16_t(stats->textHeight/2), 6)-6 + , 40 + , "Hello %s" + , "World" + ); + bgfx::dbgTextPrintf(0, 1, 0x0f, "Color can be changed with ANSI \x1b[9;me\x1b[10;ms\x1b[11;mc\x1b[12;ma\x1b[13;mp\x1b[14;me\x1b[0m code too."); + + bgfx::dbgTextPrintf(80, 1, 0x0f, "\x1b[;0m \x1b[;1m \x1b[; 2m \x1b[; 3m \x1b[; 4m \x1b[; 5m \x1b[; 6m \x1b[; 7m \x1b[0m"); + bgfx::dbgTextPrintf(80, 2, 0x0f, "\x1b[;8m \x1b[;9m \x1b[;10m \x1b[;11m \x1b[;12m \x1b[;13m \x1b[;14m \x1b[;15m \x1b[0m"); + + bgfx::dbgTextPrintf(0, 2, 0x0f, "Backbuffer %dW x %dH in pixels, debug text %dW x %dH in characters." + , stats->width + , stats->height + , stats->textWidth + , stats->textHeight + ); + + // Advance to next frame. Rendering thread will be kicked to + // process submitted rendering primitives. + bgfx::frame(); + + }, vp_); + } +}; + + +int main(int argc, char** argv) { + cv::Rect viewport(0,0, 1280, 720); + cv::Ptr runtime = V4D::init(viewport, "Display an image using bgfx", AllocateFlags::BGFX | AllocateFlags::IMGUI); + Plan::run(0); + + return 0; +} diff --git a/modules/v4d/samples/bgfx-demo2.cpp b/modules/v4d/samples/bgfx-demo2.cpp new file mode 100644 index 000000000..4b4b85cb3 --- /dev/null +++ b/modules/v4d/samples/bgfx-demo2.cpp @@ -0,0 +1,287 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include + +using namespace cv::v4d; + +// based on: https://github.com/bkaradzic/bgfx/blob/07be0f213acd73a4f6845dc8f7b20b93f66b7cc4/examples/01-cubes/cubes.cpp +class BgfxDemoPlan : public Plan { + struct PosColorVertex + { + float x_; + float y_; + float z_; + uint32_t abgr_; + + static void init() + { + layout + .begin() + .add(bgfx::Attrib::Position, 3, bgfx::AttribType::Float) + .add(bgfx::Attrib::Color0, 4, bgfx::AttribType::Uint8, true) + .end(); + }; + + inline static bgfx::VertexLayout layout; + }; + + inline static const PosColorVertex CUBE_VERTICES[] = + { + {-0.30f, 0.30f, 0.30f, 0xaa000000 }, + { 0.30f, 0.30f, 0.30f, 0xaa0000ff }, + {-0.30f, -0.30f, 0.30f, 0xaa00ff00 }, + { 0.30f, -0.30f, 0.30f, 0xaa00ffff }, + {-0.30f, 0.30f, -0.30f, 0xaaff0000 }, + { 0.30f, 0.30f, -0.30f, 0xaaff00ff }, + {-0.30f, -0.30f, -0.30f, 0xaaffff00 }, + { 0.30f, -0.30f, -0.30f, 0xaaffffff }, + }; + + inline static const uint16_t CUBE_TRI_LIST[] = + { + 0, 1, 2, // 0 + 1, 3, 2, + 4, 6, 5, // 2 + 5, 6, 7, + 0, 2, 4, // 4 + 4, 2, 6, + 1, 5, 3, // 6 + 5, 7, 3, + 0, 4, 1, // 8 + 4, 5, 1, + 2, 3, 6, // 10 + 6, 3, 7, + }; + + inline static const uint16_t CUBE_TRI_STRIP[] = + { + 0, 1, 2, + 3, + 7, + 1, + 5, + 0, + 4, + 2, + 6, + 7, + 4, + 5, + }; + + inline static const uint16_t CUBE_LINE_LIST[] = + { + 0, 1, + 0, 2, + 0, 4, + 1, 3, + 1, 5, + 2, 3, + 2, 6, + 3, 7, + 4, 5, + 4, 6, + 5, 7, + 6, 7, + }; + + inline static const uint16_t CUBE_LINE_STRIP[] = + { + 0, 2, 3, 1, 5, 7, 6, 4, + 0, 2, 6, 4, 5, 7, 3, 1, + 0, + }; + + inline static const uint16_t CUBE_POINTS[] = + { + 0, 1, 2, 3, 4, 5, 6, 7 + }; + + inline static const char* PT_NAMES[] + { + "Triangle List", + "Triangle Strip", + "Lines", + "Line Strip", + "Points", + }; + + inline static const uint64_t PT_STATE[] + { + UINT64_C(0), + BGFX_STATE_PT_TRISTRIP, + BGFX_STATE_PT_LINES, + BGFX_STATE_PT_LINESTRIP, + BGFX_STATE_PT_POINTS, + }; + + struct Params { + uint32_t width_; + uint32_t height_; + bgfx::VertexBufferHandle vbh_; + bgfx::IndexBufferHandle ibh_[BX_COUNTOF(PT_STATE)]; + bgfx::ProgramHandle program_; + int32_t pt_ = 0; + + bool red_ = true; + bool green_ = true; + bool blue_ = true; + bool alpha_ = true; + } params_; + + inline static int64_t time_offset_; + + Property vp_ = P(V4D::Keys::VIEWPORT); +public: + BgfxDemoPlan(){ + + } + void setup() override { + branch(BranchType::ONCE, always_) + ->plain([](int64_t& timeOffset) { + timeOffset = bx::getHPCounter(); + }, RWS(time_offset_)) + ->endBranch(); + + bgfx([](const cv::Rect& vp, Params& params){ + params.width_ = vp.width; + params.height_ = vp.height; + // Set view 0 clear state. + bgfx::setViewClear(0 + , BGFX_CLEAR_COLOR|BGFX_CLEAR_DEPTH + , 0x00000000 + , 1.0f + , 0 + ); + PosColorVertex::init(); + + // Set view 0 default viewport. + bgfx::setViewRect(0, vp.x, vp.y, uint16_t(vp.width), uint16_t(vp.height)); + + // Create static vertex buffer. + params.vbh_ = bgfx::createVertexBuffer( + // Static data can be passed with bgfx::makeRef + bgfx::makeRef(CUBE_VERTICES, sizeof(CUBE_VERTICES) ) + , PosColorVertex::layout + ); + + // Create static index buffer for triangle list rendering. + params.ibh_[0] = bgfx::createIndexBuffer( + // Static data can be passed with bgfx::makeRef + bgfx::makeRef(CUBE_TRI_LIST, sizeof(CUBE_TRI_LIST) ) + ); + + // Create static index buffer for triangle strip rendering. 
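+        //(index buffers for line lists, line strips and points follow; params.pt_ selects which
+        // one is drawn and PT_STATE supplies the matching primitive render state)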
+ params.ibh_[1] = bgfx::createIndexBuffer( + // Static data can be passed with bgfx::makeRef + bgfx::makeRef(CUBE_TRI_STRIP, sizeof(CUBE_TRI_STRIP) ) + ); + + // Create static index buffer for line list rendering. + params.ibh_[2] = bgfx::createIndexBuffer( + // Static data can be passed with bgfx::makeRef + bgfx::makeRef(CUBE_LINE_LIST, sizeof(CUBE_LINE_LIST) ) + ); + + // Create static index buffer for line strip rendering. + params.ibh_[3] = bgfx::createIndexBuffer( + // Static data can be passed with bgfx::makeRef + bgfx::makeRef(CUBE_LINE_STRIP, sizeof(CUBE_LINE_STRIP) ) + ); + + // Create static index buffer for point list rendering. + params.ibh_[4] = bgfx::createIndexBuffer( + // Static data can be passed with bgfx::makeRef + bgfx::makeRef(CUBE_POINTS, sizeof(CUBE_POINTS) ) + ); + + // Create program from shaders. + params.program_ = util::load_program("vs_cubes", "fs_cubes"); + + }, vp_, RW(params_)); + } + + void infer() override { + bgfx([](const Params& params, const int64_t timeOffset) { + float time = (float)( (bx::getHPCounter()-timeOffset)/double(bx::getHPFrequency())); + + const bx::Vec3 at = { 0.0f, 0.0f, 0.0f }; + const bx::Vec3 eye = { 0.0f, 0.0f, -35.0f }; + + // Set view and projection matrix for view 0. + { + + float view[16]; + bx::mtxLookAt(view, eye, at); + + float proj[16]; + bx::mtxProj(proj, 60.0f, float(params.width_)/float(params.height_), 0.1f, 100.0f, bgfx::getCaps()->homogeneousDepth); + + bgfx::setViewTransform(0, view, proj); + + // Set view 0 default viewport. + bgfx::setViewRect(0, 0, 0, uint16_t(params.width_), uint16_t(params.height_) ); + } + + // This dummy draw call is here to make sure that view 0 is cleared + // if no other draw calls are submitted to view 0. + bgfx::touch(0); + + bgfx::IndexBufferHandle ibh = params.ibh_[params.pt_]; + uint64_t state = 0 + | (params.red_ ? BGFX_STATE_WRITE_R : 0) + | (params.green_ ? BGFX_STATE_WRITE_G : 0) + | (params.blue_ ? BGFX_STATE_WRITE_B : 0) + | (params.alpha_ ? BGFX_STATE_WRITE_A : 0) + | BGFX_STATE_WRITE_Z + | BGFX_STATE_DEPTH_TEST_LESS + | BGFX_STATE_CULL_CW + | BGFX_STATE_MSAA + | PT_STATE[params.pt_] + ; + + + // Submit 11x11 cubes. + for (uint32_t yy = 0; yy < 100; ++yy) + { + for (uint32_t xx = 0; xx < 100; ++xx) + { + float mtx[16]; + float angle = fmod(float(time) + sin((float(xx * yy / pow(170.0f, 2.0f)) * 2.0f - 1.0f) * CV_PI), 2.0f * CV_PI); + bx::mtxRotateXYZ(mtx, angle, angle, angle); + mtx[12] = ((xx / 100.0) * 2.0 - 1.0) * 30.0; + mtx[13] = ((yy / 100.0) * 2.0 - 1.0) * 30.0; + mtx[14] = 0.0f; + + // Set model matrix for rendering. + bgfx::setTransform(mtx); + + // Set vertex and index buffer. + bgfx::setVertexBuffer(0, params.vbh_); + bgfx::setIndexBuffer(ibh); + + // Set render states. + bgfx::setState(state); + + // Submit primitive for rendering to view 0. + bgfx::submit(0, params.program_); + } + } + + // Advance to next frame. Rendering thread will be kicked to + // process submitted rendering primitives. + bgfx::frame(); + }, R(params_), CS(time_offset_)); + } +}; + + +int main(int argc, char** argv) { + cv::Ptr runtime = V4D::init(cv::Rect(0,0, 1280, 720), "Bgfx Demo", AllocateFlags::BGFX | AllocateFlags::IMGUI); + Plan::run(std::stoi(argv[1])); + + return 0; +} diff --git a/modules/v4d/samples/cube-demo.cpp b/modules/v4d/samples/cube-demo.cpp new file mode 100644 index 000000000..55deca3b7 --- /dev/null +++ b/modules/v4d/samples/cube-demo.cpp @@ -0,0 +1,248 @@ +// This file is part of OpenCV project. 
+// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include +//adapted from https://gitlab.com/wikibooks-opengl/modern-tutorials/-/blob/master/tut05_cube/cube.cpp + +using namespace cv::v4d; + +class CubeDemoPlan : public Plan { +public: + using Plan::Plan; + + /* Demo Parameters */ + int glowKernelSize_ = 0; + + /* OpenGL constants */ + constexpr static GLuint TRIANGLES_ = 12; + constexpr static GLuint VERTICES_INDEX_ = 0; + constexpr static GLuint COLOR_INDEX_ = 1; + + //Cube vertices, colors and indices + constexpr static float VERTICES[24] = { + // Front face + 0.5, 0.5, 0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, 0.5, + // Back face + 0.5, 0.5, -0.5, -0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, -0.5, -0.5 + }; + + constexpr static float VERTEX_COLORS_[24] = { + 1.0, 0.4, 0.6, 1.0, 0.9, 0.2, 0.7, 0.3, 0.8, 0.5, 0.3, 1.0, + 0.2, 0.6, 1.0, 0.6, 1.0, 0.4, 0.6, 0.8, 0.8, 0.4, 0.8, 0.8 + }; + + constexpr static unsigned short TRIANGLE_INDICES_[36] = { + // Front + 0, 1, 2, 2, 3, 0, + + // Right + 0, 3, 7, 7, 4, 0, + + // Bottom + 2, 6, 7, 7, 3, 2, + + // Left + 1, 5, 6, 6, 2, 1, + + // Back + 4, 7, 6, 6, 5, 4, + + // Top + 5, 1, 0, 0, 4, 5 + }; +private: + struct Cache { + cv::UMat down_; + cv::UMat up_; + cv::UMat blur_; + cv::UMat dst16_; + } cache_; + GLuint vao_ = 0; + GLuint shaderProgram_ = 0; + GLuint uniformTransform_= 0; + + //Simple transform & pass-through shaders + static GLuint load_shader() { + //Shader versions "330" and "300 es" are very similar. + //If you are careful you can write the same code for both versions. + #if !defined(OPENCV_V4D_USE_ES3) + const string shaderVersion = "330"; + #else + const string shaderVersion = "300 es"; + #endif + + const string vert = + " #version " + shaderVersion + + R"( + precision lowp float; + layout(location = 0) in vec3 pos; + layout(location = 1) in vec3 vertex_color; + + uniform mat4 transform; + + out vec3 color; + void main() { + gl_Position = transform * vec4(pos, 1.0); + color = vertex_color; + } + )"; + + const string frag = + " #version " + shaderVersion + + R"( + precision lowp float; + in vec3 color; + + out vec4 frag_color; + + void main() { + frag_color = vec4(color, 1.0); + } + )"; + + //Initialize the shaders and returns the program + unsigned int handles[3]; + cv::v4d::initShader(handles, vert.c_str(), frag.c_str(), "fragColor"); + return handles[0]; + } + + //Initializes objects, buffers, shaders and uniforms + static void init_scene(const cv::Size& sz, GLuint& vao, GLuint& shaderProgram, GLuint& uniformTransform) { + glEnable (GL_DEPTH_TEST); + + glGenVertexArrays(1, &vao); + glBindVertexArray(vao); + + unsigned int triangles_ebo; + glGenBuffers(1, &triangles_ebo); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, triangles_ebo); + glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof TRIANGLE_INDICES_, TRIANGLE_INDICES_, + GL_STATIC_DRAW); + + unsigned int verticies_vbo; + glGenBuffers(1, &verticies_vbo); + glBindBuffer(GL_ARRAY_BUFFER, verticies_vbo); + glBufferData(GL_ARRAY_BUFFER, sizeof VERTICES, VERTICES, GL_STATIC_DRAW); + + glVertexAttribPointer(VERTICES_INDEX_, 3, GL_FLOAT, GL_FALSE, 0, NULL); + glEnableVertexAttribArray(VERTICES_INDEX_); + + unsigned int colors_vbo; + glGenBuffers(1, &colors_vbo); + glBindBuffer(GL_ARRAY_BUFFER, colors_vbo); + glBufferData(GL_ARRAY_BUFFER, sizeof VERTEX_COLORS_, VERTEX_COLORS_, GL_STATIC_DRAW); + + glVertexAttribPointer(COLOR_INDEX_, 3, GL_FLOAT, 
GL_FALSE, 0, NULL); + glEnableVertexAttribArray(COLOR_INDEX_); + + glBindVertexArray(0); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0); + glBindBuffer(GL_ARRAY_BUFFER, 0); + + shaderProgram = load_shader(); + uniformTransform = glGetUniformLocation(shaderProgram, "transform"); + glViewport(0,0, sz.width, sz.height); + } + + //Renders a rotating rainbow-colored cube on a blueish background + static void render_scene(GLuint &vao, GLuint &shaderProgram, + GLuint &uniformTransform) { + //Clear the background + glClearColor(0.2, 0.24, 0.4, 1); + glClear(GL_COLOR_BUFFER_BIT); + + //Use the prepared shader program + glUseProgram(shaderProgram); + + //Scale and rotate the cube depending on the current time. + float angle = fmod( + double(cv::getTickCount()) / double(cv::getTickFrequency()), + 2 * M_PI); + float scale = 0.25; + + cv::Matx44f scaleMat(scale, 0.0, 0.0, 0.0, 0.0, scale, 0.0, 0.0, 0.0, 0.0, + scale, 0.0, 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotXMat(1.0, 0.0, 0.0, 0.0, 0.0, cos(angle), -sin(angle), 0.0, + 0.0, sin(angle), cos(angle), 0.0, 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotYMat(cos(angle), 0.0, sin(angle), 0.0, 0.0, 1.0, 0.0, 0.0, + -sin(angle), 0.0, cos(angle), 0.0, 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotZMat(cos(angle), -sin(angle), 0.0, 0.0, sin(angle), + cos(angle), 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0); + + //calculate the transform + cv::Matx44f transform = scaleMat * rotXMat * rotYMat * rotZMat; + //set the corresponding uniform + glUniformMatrix4fv(uniformTransform, 1, GL_FALSE, transform.val); + //Bind the prepared vertex array object + glBindVertexArray(vao); + //Draw + glDrawElements(GL_TRIANGLES, TRIANGLES_ * 3, GL_UNSIGNED_SHORT, NULL); + } + + //applies a glow effect to an image + static void glow_effect(const cv::UMat& src, cv::UMat& dst, const int ksize, Cache& cache) { + cv::bitwise_not(src, dst); + + //Resize for some extra performance + cv::resize(dst, cache.down_, cv::Size(), 0.5, 0.5); + //Cheap blur + cv::boxFilter(cache.down_, cache.blur_, -1, cv::Size(ksize, ksize), cv::Point(-1, -1), true, + cv::BORDER_REPLICATE); + //Back to original size + cv::resize(cache.blur_, cache.up_, src.size()); + + //Multiply the src image with a blurred version of itself + cv::multiply(dst, cache.up_, cache.dst16_, 1, CV_16U); + //Normalize and convert back to CV_8U + cv::divide(cache.dst16_, cv::Scalar::all(255.0), dst, 1, CV_8U); + + cv::bitwise_not(dst, dst); + } + +public: + void setup(cv::Ptr window) override { + int diag = hypot(double(size().width), double(size().height)); + glowKernelSize_ = std::max(int(diag / 138 % 2 == 0 ? 
diag / 138 + 1 : diag / 138), 1); + window->gl([](const cv::Size& sz, GLuint& v, GLuint& sp, GLuint& ut){ + init_scene(sz, v, sp, ut); + }, size(), vao_, shaderProgram_, uniformTransform_); + } + + void infer(cv::Ptr window) override { + window->gl([](){ + //Clear the background + glClearColor(0.2f, 0.24f, 0.4f, 1.0f); + glClear(GL_COLOR_BUFFER_BIT); + }); + + //Render using multiple OpenGL contexts + window->gl([](GLuint& v, GLuint& sp, GLuint& ut){ + render_scene(v, sp, ut); + }, vao_, shaderProgram_, uniformTransform_); + + //Aquire the frame buffer for use by OpenCV + window->fb([](cv::UMat& framebuffer, const cv::Rect& viewport, int glowKernelSize, Cache& cache) { + cv::UMat roi = framebuffer(viewport); + glow_effect(roi, roi, glowKernelSize, cache); + }, viewport(), glowKernelSize_, cache_); + + window->write(); + } +}; + +int main() { + cv::Ptr plan = new CubeDemoPlan(cv::Size(1280, 720)); + cv::Ptr window = V4D::make(plan->size(), "Cube Demo", ALL); + + //Creates a writer sink (which might be hardware accelerated) + auto sink = makeWriterSink(window, "cube-demo.mkv", 60, plan->size()); + window->setSink(sink); + window->run(plan); + + return 0; +} diff --git a/modules/v4d/samples/custom_source_and_sink.cpp b/modules/v4d/samples/custom_source_and_sink.cpp new file mode 100644 index 000000000..89170f73e --- /dev/null +++ b/modules/v4d/samples/custom_source_and_sink.cpp @@ -0,0 +1,62 @@ +#include +#include + +using namespace cv; +using namespace cv::v4d; + +class CustomSourceAndSinkPlan : public Plan { + string hr_ = "Hello Rainbow!"; +public: + CustomSourceAndSinkPlan(const cv::Size& sz) : Plan(sz) { + } + + void infer(cv::Ptr win) override { + win->capture(); + + //Render "Hello Rainbow!" over the video + win->nvg([](const Size& sz, const string& str) { + using namespace cv::v4d::nvg; + + fontSize(40.0f); + fontFace("sans-bold"); + fillColor(Scalar(255, 0, 0, 255)); + textAlign(NVG_ALIGN_CENTER | NVG_ALIGN_TOP); + text(sz.width / 2.0, sz.height / 2.0, str.c_str(), str.c_str() + str.size()); + }, win->fbSize(), hr_); + + win->write(); + } +}; + +int main() { + Ptr plan = new CustomSourceAndSinkPlan(cv::Size(960, 960)); + Ptr window = V4D::make(plan->size(), "Custom Source/Sink"); + + //Make a source that generates rainbow frames. + cv::Ptr src = new Source([](cv::UMat& frame){ + static long cnt = 0; + //The source is responsible for initializing the frame.. + if(frame.empty()) + frame.create(Size(960, 960), CV_8UC3); + frame = colorConvert(Scalar(++cnt % 180, 128, 128, 255), COLOR_HLS2BGR); + return true; + }, 60.0f); + + //Make a sink the saves each frame to a PNG file (does nothing in case of WebAssembly). 
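+    //The sink lambda receives the frame sequence number and the finished frame;
+    //returning false signals that writing the frame failed.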
+ cv::Ptr sink = new Sink([](const uint64_t& seq, const cv::UMat& frame){ + try { + imwrite(std::to_string(seq) + ".png", frame); + } catch(std::exception& ex) { + cerr << "Unable to write frame: " << ex.what() << endl; + return false; + } + return true; + }); + + //Attach source and sink + window->setSource(src); + window->setSink(sink); + + window->run(plan); +} + diff --git a/modules/v4d/samples/display_image.cpp b/modules/v4d/samples/display_image.cpp new file mode 100644 index 000000000..f07692195 --- /dev/null +++ b/modules/v4d/samples/display_image.cpp @@ -0,0 +1,32 @@ +#include +#include + +using namespace cv; +using namespace cv::v4d; + +class DisplayImagePlan : public Plan { + UMat image_; +public: + DisplayImagePlan(const cv::Size& sz) : Plan(sz) { + } + + void setup(Ptr win) override { + win->plain([](cv::UMat& image){ + image = imread(samples::findFile("lena.jpg")).getUMat(ACCESS_READ); + }, image_); + } + + void infer(Ptr win) override { + //Feeds the image to the video pipeline + win->feed(image_); + } +}; + +int main() { + cv::Ptr plan = new DisplayImagePlan(cv::Size(960,960)); + //Creates a V4D window for on screen rendering with a window size of 960x960 and a framebuffer of the same size. + //Please note that while the window size may change the framebuffer size may not. If you need multiple framebuffer + //sizes you need multiple V4D objects + cv::Ptr window = V4D::make(plan->size(), "Display an Image"); + window->run(plan); +} diff --git a/modules/v4d/samples/display_image_fb.cpp b/modules/v4d/samples/display_image_fb.cpp new file mode 100644 index 000000000..17ad59439 --- /dev/null +++ b/modules/v4d/samples/display_image_fb.cpp @@ -0,0 +1,40 @@ +#include +#include + +using namespace cv; +using namespace cv::v4d; + +class DisplayImageFB : public Plan { + UMat image_; + UMat converted_; +public: + DisplayImageFB(const cv::Size& sz) : Plan(sz) { + } + + void setup(cv::Ptr win) override { + win->plain([](cv::UMat& image, cv::UMat& converted, const cv::Size& sz){ + //Loads an image as a UMat (just in case we have hardware acceleration available) + image = imread(samples::findFile("lena.jpg")).getUMat(ACCESS_READ); + + //We have to manually resize and color convert the image when using direct frambuffer access. + resize(image, converted, sz); + cvtColor(converted, converted, COLOR_RGB2BGRA); + }, image_, converted_, win->fbSize()); + } + + void infer(Ptr win) override { + //Create a fb context and copy the prepared image to the framebuffer. 
The fb context + //takes care of retrieving and storing the data on the graphics card (using CL-GL + //interop if available), ready for other contexts to use + win->fb([](UMat& framebuffer, const cv::UMat& c){ + c.copyTo(framebuffer); + }, converted_); + } +}; + +int main() { + Ptr plan = new DisplayImageFB(cv::Size(960,960)); + //Creates a V4D object + Ptr window = V4D::make(plan->size(), "Display an Image through direct FB access"); + window->run(plan); +} diff --git a/modules/v4d/samples/display_image_nvg.cpp b/modules/v4d/samples/display_image_nvg.cpp new file mode 100644 index 000000000..7ccc382f6 --- /dev/null +++ b/modules/v4d/samples/display_image_nvg.cpp @@ -0,0 +1,59 @@ +#include +#include + +using namespace cv; +using namespace cv::v4d; + +class DisplayImageNVG : public Plan { + //A simple struct to hold our image variables + struct Image_t { + std::string filename_; + nvg::Paint paint_; + int w_; + int h_; + } image_; +public: + DisplayImageNVG(const cv::Size& sz) : Plan(sz) { + } + + void setup(Ptr win) override{ + //Set the filename + image_.filename_ = samples::findFile("lena.jpg"); + + //Creates a NanoVG context. The wrapped C-functions of NanoVG are available in the namespace cv::v4d::nvg; + win->nvg([](Image_t& img) { + using namespace cv::v4d::nvg; + //Create the image_ and receive a handle. + int handle = createImage(img.filename_.c_str(), NVG_IMAGE_NEAREST); + //Make sure it was created successfully + CV_Assert(handle > 0); + //Query the image_ size + imageSize(handle, &img.w_, &img.h_); + //Create a simple image_ pattern with the image dimensions + img.paint_ = imagePattern(0, 0, img.w_, img.h_, 0.0f/180.0f*NVG_PI, handle, 1.0); + }, image_); + } + + void infer(Ptr win) override{ + //Creates a NanoVG context to draw the loaded image_ over again to the screen. + win->nvg([](const Image_t& img, const cv::Size& sz) { + using namespace cv::v4d::nvg; + beginPath(); + //Scale all further calls to window size + scale(double(sz.width)/img.w_, double(sz.height)/img.h_); + //Create a rounded rectangle with the images dimensions. + //Note that actually this rectangle will have the size of the window + //because of the previous scale call. + roundedRect(0,0, img.w_, img.h_, 50); + //Fill the rounded rectangle with our picture + fillPaint(img.paint_); + fill(); + }, image_, win->fbSize()); + } +}; + +int main() { + Ptr plan = new DisplayImageNVG(cv::Size(960,960)); + Ptr window = V4D::make(plan->size(), "Display an Image using NanoVG"); + window->run(plan); +} diff --git a/modules/v4d/samples/example_v4d_beauty-demo.html b/modules/v4d/samples/example_v4d_beauty-demo.html new file mode 100644 index 000000000..44d2ab87c --- /dev/null +++ b/modules/v4d/samples/example_v4d_beauty-demo.html @@ -0,0 +1,264 @@ + + + + Beauty Demo + + + + + + + + + + +
+<!-- page body: "Downloading..." status indicator and container elements; remaining markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_capture.sh b/modules/v4d/samples/example_v4d_capture.sh
new file mode 100755
index 000000000..1108d1c60
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_capture.sh
@@ -0,0 +1,271 @@
+#!/bin/bash
+
+title=$1
+name=$2
+
+cat << EOF
+<!-- page body for "${title}": "Downloading..." status indicator and container elements; remaining markup omitted -->
+EOF
diff --git a/modules/v4d/samples/example_v4d_cube-demo.html b/modules/v4d/samples/example_v4d_cube-demo.html
new file mode 100644
index 000000000..33e2b25ab
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_cube-demo.html
@@ -0,0 +1,210 @@
+<!-- "Cube Demo" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_custom_source_and_sink.html b/modules/v4d/samples/example_v4d_custom_source_and_sink.html
new file mode 100644
index 000000000..2c8a33255
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_custom_source_and_sink.html
@@ -0,0 +1,210 @@
+<!-- "Custom Source and Sink" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_display_image.html b/modules/v4d/samples/example_v4d_display_image.html
new file mode 100644
index 000000000..3f565981f
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_display_image.html
@@ -0,0 +1,210 @@
+<!-- "Display an Image through the Video-Pipeline" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_display_image_fb.html b/modules/v4d/samples/example_v4d_display_image_fb.html
new file mode 100644
index 000000000..b87a4aa72
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_display_image_fb.html
@@ -0,0 +1,210 @@
+<!-- "Display an Image through the FB Context" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_display_image_nvg.html b/modules/v4d/samples/example_v4d_display_image_nvg.html
new file mode 100644
index 000000000..824fe8328
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_display_image_nvg.html
@@ -0,0 +1,210 @@
+<!-- "Display an Image through NanoVG" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_font-demo.html b/modules/v4d/samples/example_v4d_font-demo.html
new file mode 100644
index 000000000..fa8357846
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_font-demo.html
@@ -0,0 +1,210 @@
+<!-- "Font Demo" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_font_rendering.html b/modules/v4d/samples/example_v4d_font_rendering.html
new file mode 100644
index 000000000..ee3fdcc82
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_font_rendering.html
@@ -0,0 +1,210 @@
+<!-- "Font rendering" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_font_with_gui.html b/modules/v4d/samples/example_v4d_font_with_gui.html
new file mode 100644
index 000000000..cc9b0e3bf
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_font_with_gui.html
@@ -0,0 +1,210 @@
+<!-- "Font rendering with Form-based GUI" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_many_cubes-demo.html b/modules/v4d/samples/example_v4d_many_cubes-demo.html
new file mode 100644
index 000000000..a408c8d2d
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_many_cubes-demo.html
@@ -0,0 +1,210 @@
+<!-- "Many Cubes Demo" page: "Downloading..." status indicator and container elements; markup omitted -->
diff --git a/modules/v4d/samples/example_v4d_nanovg-demo.html b/modules/v4d/samples/example_v4d_nanovg-demo.html
new file mode 100644
index 000000000..3766fb0b2
--- /dev/null
+++ b/modules/v4d/samples/example_v4d_nanovg-demo.html
@@ -0,0 +1,264 @@
+<!-- "NanoVG Demo" page: "Downloading..." status indicator and container elements; markup omitted -->
Downloading...
+ +
+ +
+ + +
+ +
+ + + + diff --git a/modules/v4d/samples/example_v4d_nocapture.sh b/modules/v4d/samples/example_v4d_nocapture.sh new file mode 100755 index 000000000..39e6903d1 --- /dev/null +++ b/modules/v4d/samples/example_v4d_nocapture.sh @@ -0,0 +1,218 @@ +#!/bin/bash + +title=$1 +name=$2 + +cat << EOF + + + + ${title} + + + + + + + + +
Downloading...
+
+ +
+ +
+ +
+ + + + + +EOF + diff --git a/modules/v4d/samples/example_v4d_optflow-demo.html b/modules/v4d/samples/example_v4d_optflow-demo.html new file mode 100644 index 000000000..8f3bd21cb --- /dev/null +++ b/modules/v4d/samples/example_v4d_optflow-demo.html @@ -0,0 +1,264 @@ + + + + Sparse Optical Flow Demo + + + + + + + + + + +
Downloading...
+ +
+ +
+ + +
+ +
+ + + + diff --git a/modules/v4d/samples/example_v4d_pedestrian-demo.html b/modules/v4d/samples/example_v4d_pedestrian-demo.html new file mode 100644 index 000000000..4e06b065b --- /dev/null +++ b/modules/v4d/samples/example_v4d_pedestrian-demo.html @@ -0,0 +1,264 @@ + + + + Pedestrian Demo + + + + + + + + + + +
Downloading...
+ +
+ +
+ + +
+ +
+ + + + diff --git a/modules/v4d/samples/example_v4d_render_opengl.html b/modules/v4d/samples/example_v4d_render_opengl.html new file mode 100644 index 000000000..9e0e0488b --- /dev/null +++ b/modules/v4d/samples/example_v4d_render_opengl.html @@ -0,0 +1,210 @@ + + + + Render OpenGL Blue Screen + + + + + + + + +
Downloading...
+
+ +
+ +
+ +
+ + + + + diff --git a/modules/v4d/samples/example_v4d_shader-demo.html b/modules/v4d/samples/example_v4d_shader-demo.html new file mode 100644 index 000000000..91c9c2ed8 --- /dev/null +++ b/modules/v4d/samples/example_v4d_shader-demo.html @@ -0,0 +1,264 @@ + + + + Mandelbrot Shader Demo + + + + + + + + + + +
Downloading...
+ +
+ +
+ + +
+ +
+ + + + diff --git a/modules/v4d/samples/example_v4d_vector_graphics.html b/modules/v4d/samples/example_v4d_vector_graphics.html new file mode 100644 index 000000000..b000315a6 --- /dev/null +++ b/modules/v4d/samples/example_v4d_vector_graphics.html @@ -0,0 +1,210 @@ + + + + Vector Graphics + + + + + + + + +
Downloading...
+
+ +
+ +
+ +
+ + + + + diff --git a/modules/v4d/samples/example_v4d_vector_graphics_and_fb.html b/modules/v4d/samples/example_v4d_vector_graphics_and_fb.html new file mode 100644 index 000000000..7295bd512 --- /dev/null +++ b/modules/v4d/samples/example_v4d_vector_graphics_and_fb.html @@ -0,0 +1,210 @@ + + + + Vector Graphics and Frambuffer access + + + + + + + + +
Downloading...
+
+ +
+ +
+ +
+ + + + + diff --git a/modules/v4d/samples/example_v4d_video-demo.html b/modules/v4d/samples/example_v4d_video-demo.html new file mode 100644 index 000000000..c6b59f73c --- /dev/null +++ b/modules/v4d/samples/example_v4d_video-demo.html @@ -0,0 +1,264 @@ + + + + Video Demo + + + + + + + + + + +
Downloading...
+ +
+ +
+ + +
+ +
+ + + + diff --git a/modules/v4d/samples/example_v4d_video_editing.html b/modules/v4d/samples/example_v4d_video_editing.html new file mode 100644 index 000000000..235db12dd --- /dev/null +++ b/modules/v4d/samples/example_v4d_video_editing.html @@ -0,0 +1,264 @@ + + + + Video Editing + + + + + + + + + + +
Downloading...
+ +
+ +
+ + +
+ +
+ + + + diff --git a/modules/v4d/samples/font-demo.cpp b/modules/v4d/samples/font-demo.cpp new file mode 100644 index 000000000..7798b52dd --- /dev/null +++ b/modules/v4d/samples/font-demo.cpp @@ -0,0 +1,243 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include + +#include +#include +#include +#include +#include + +using std::string; +using std::vector; +using std::istringstream; + +using namespace cv::v4d; + +class FontDemoPlan : public Plan { + static struct Params { + const cv::Scalar_ INITIAL_COLOR = cv::v4d::colorConvert(cv::Scalar(0.15 * 180.0, 128, 255, 255), cv::COLOR_HLS2RGB); + float minStarSize_ = 0.5f; + float maxStarSize_ = 1.5f; + int minStarCount_ = 1000; + int maxStarCount_ = 3000; + float starAlpha_ = 0.3f; + + float fontSize_ = 0.0f; + cv::Scalar_ textColor_ = INITIAL_COLOR / 255.0; + float warpRatio_ = 1.0f/3.0f; + bool updateStars_ = true; + bool updatePerspective_ = true; + } params_; + + //BGRA + inline static cv::UMat stars_; + cv::UMat warped_; + //transformation matrix + inline static cv::Mat tm_; + + static struct TextVars { + //the text to display + vector lines_; + //global frame count + uint32_t global_cnt_ = 0; + //Total number of lines in the text + int32_t numLines_ = 0; + //Height of the text in pixels + int32_t textHeight_ = 0; + } textVars_; + + //the sequence number of the current frame + uint32_t seqNum_ = 0; + //y-value of the current line + int32_t y_ = 0; + + int32_t translateY_ = 0; + + cv::RNG rng_ = cv::getTickCount(); +public: + using Plan::Plan; + + FontDemoPlan(const cv::Size& sz) : FontDemoPlan(cv::Rect(0, 0, sz.width, sz.height)) { + Global::registerShared(params_); + Global::registerShared(textVars_); + Global::registerShared(tm_); + Global::registerShared(stars_); + } + + FontDemoPlan(const cv::Rect& vp) : Plan(vp) { + } + + void gui(cv::Ptr window) override { + window->imgui([](cv::Ptr win, ImGuiContext* ctx, Params& params){ + CV_UNUSED(win); + using namespace ImGui; + SetCurrentContext(ctx); + Begin("Effect"); + Text("Text Crawl"); + SliderFloat("Font Size", ¶ms.fontSize_, 1.0f, 100.0f); + if(SliderFloat("Warp Ratio", ¶ms.warpRatio_, 0.1f, 1.0f)) + params.updatePerspective_ = true; + ColorPicker4("Text Color", params.textColor_.val); + Text("Stars"); + + if(SliderFloat("Min Star Size", ¶ms.minStarSize_, 0.5f, 1.0f)) + params.updateStars_ = true; + if(SliderFloat("Max Star Size", ¶ms.maxStarSize_, 1.0f, 10.0f)) + params.updateStars_ = true; + if(SliderInt("Min Star Count", ¶ms.minStarCount_, 1, 1000)) + params.updateStars_ = true; + if(SliderInt("Max Star Count", ¶ms.maxStarCount_, 1000, 5000)) + params.updateStars_ = true; + if(SliderFloat("Min Star Alpha", ¶ms.starAlpha_, 0.2f, 1.0f)) + params.updateStars_ = true; + End(); + }, params_); + } + + void setup(cv::Ptr window) override { + window->once([](const cv::Size& sz, TextVars& textVars, Params& params){ + //The text to display + string txt = cv::getBuildInformation(); + //Save the text to a vector + std::istringstream iss(txt); + + int fontSize = hypot(sz.width, sz.height) / 60.0; + { + Global::Scope scope(textVars); + for (std::string line; std::getline(iss, line); ) { + textVars.lines_.push_back(line); + } + textVars.numLines_ = textVars.lines_.size(); + textVars.textHeight_ = (textVars.numLines_ * fontSize); + } + { + Global::Scope scope(params); + params.fontSize_ = fontSize; + } + }, 
size(), textVars_, params_); + } + + void infer(cv::Ptr window) override { + window->branch(0, isTrue_, params_.updateStars_); + { + window->nvg([](const cv::Size& sz, cv::RNG& rng, const Params& params) { + Params p = Global::safe_copy(params); + using namespace cv::v4d::nvg; + clear(); + + //draw stars + int numStars = rng.uniform(p.minStarCount_, p.maxStarCount_); + for(int i = 0; i < numStars; ++i) { + beginPath(); + const auto& size = rng.uniform(p.minStarSize_, p.maxStarSize_); + strokeWidth(size); + strokeColor(cv::Scalar(255, 255, 255, p.starAlpha_ * 255.0f)); + circle(rng.uniform(0, sz.width) , rng.uniform(0, sz.height), size / 2.0); + stroke(); + } + }, size(), rng_, params_); + + window->fb([](const cv::UMat& framebuffer, const cv::Rect& viewport, cv::UMat& stars, Params& params){ + { + Global::Scope scope(stars); + framebuffer(viewport).copyTo(stars); + } + { + Global::Scope scope(params); + params.updateStars_ = false; + } + }, viewport(), stars_, params_); + } + window->endbranch(0, isTrue_, params_.updateStars_); + + window->branch(0, isTrue_, params_.updatePerspective_); + { + window->plain([](const cv::Size& sz, cv::Mat& tm, Params& params){ + Params p = Global::safe_copy(params); + //Derive the transformation matrix tm for the pseudo 3D effect from quad1 and quad2. + vector quad1 = {cv::Point2f(0,0),cv::Point2f(sz.width,0), + cv::Point2f(sz.width,sz.height),cv::Point2f(0,sz.height)}; + float l = (sz.width - (sz.width * p.warpRatio_)) / 2.0; + float r = sz.width - l; + + vector quad2 = {cv::Point2f(l, 0.0f),cv::Point2f(r, 0.0f), + cv::Point2f(sz.width,sz.height), cv::Point2f(0,sz.height)}; + + Global::Scope scope(tm); + tm = cv::getPerspectiveTransform(quad1, quad2); + }, size(), tm_, params_); + } + window->endbranch(0, isTrue_, params_.updatePerspective_); + + window->branch(always_); + { + window->nvg([](const cv::Size& sz, int32_t& ty, const int32_t& seqNum, int32_t& y, const TextVars& textVars, const Params& params) { + Params p = Global::safe_copy(params); + TextVars txt = Global::safe_copy(textVars); + + //How many pixels to translate the text up. + ty = sz.height - seqNum; + using namespace cv::v4d::nvg; + clear(); + fontSize(p.fontSize_); + fontFace("sans-bold"); + fillColor(p.textColor_ * 255); + textAlign(NVG_ALIGN_CENTER | NVG_ALIGN_TOP); + + /** only draw lines that are visible **/ + translate(0, ty); + + for (size_t i = 0; i < txt.lines_.size(); ++i) { + y = (i * p.fontSize_); + if (y + ty < txt.textHeight_ && y + ty + p.fontSize_ > 0) { + text(sz.width / 2.0, y, txt.lines_[i].c_str(), txt.lines_[i].c_str() + txt.lines_[i].size()); + } + } + }, size(), translateY_, seqNum_, y_, textVars_, params_); + + window->fb([](cv::UMat& framebuffer, const cv::Rect& viewport, cv::UMat& warped, cv::UMat& stars, cv::Mat& tm) { + { + Global::Scope scope(tm); + cv::warpPerspective(framebuffer(viewport), warped, tm, viewport.size(), cv::INTER_LINEAR, cv::BORDER_CONSTANT, cv::Scalar()); + } + { + Global::Scope scope(stars); + cv::add(stars.clone(), warped, framebuffer(viewport)); + } + }, viewport(), warped_, stars_, tm_); + + window->write(); + + window->plain([](const int32_t& translateY, TextVars& textVars, uint32_t& seqNum) { + Global::Scope scope(textVars); + if(-translateY > textVars.textHeight_) { + //reset the scroll once the text is out of the picture + textVars.global_cnt_ = 0; + } + ++textVars.global_cnt_; + //Wrap the cnt around if it becomes to big. 
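+                    //Resetting well below the uint32_t maximum keeps seqNum_, which is derived from this counter, from ever overflowing.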
+ if(textVars.global_cnt_ > std::numeric_limits().max() / 2.0) + textVars.global_cnt_ = 0; + seqNum = textVars.global_cnt_; + }, translateY_, textVars_, seqNum_); + } + window->endbranch(always_); + } +}; + +FontDemoPlan::Params FontDemoPlan::params_; +FontDemoPlan::TextVars FontDemoPlan::textVars_; + +int main() { + cv::Ptr plan = new FontDemoPlan(cv::Size(1280, 720)); + cv::Ptr window = V4D::make(plan->size(), "Font Demo", ALL); + + auto sink = makeWriterSink(window, "font-demo.mkv", 60, plan->size()); + window->setSink(sink); + window->run(plan); + return 0; +} diff --git a/modules/v4d/samples/font_rendering.cpp b/modules/v4d/samples/font_rendering.cpp new file mode 100644 index 000000000..53697baa2 --- /dev/null +++ b/modules/v4d/samples/font_rendering.cpp @@ -0,0 +1,33 @@ +#include + +using namespace cv; +using namespace cv::v4d; + +class FontRenderingPlan: public Plan { + //The text to render + string hw_ = "Hello World"; +public: + FontRenderingPlan(const cv::Size& sz) : Plan(sz) { + } + + void infer(Ptr win) override { + //Render the text at the center of the screen. Note that you can load you own fonts. + win->nvg([](const Size &sz, const string &str) { + using namespace cv::v4d::nvg; + clear(); + fontSize(40.0f); + fontFace("sans-bold"); + fillColor(Scalar(255, 0, 0, 255)); + textAlign(NVG_ALIGN_CENTER | NVG_ALIGN_TOP); + text(sz.width / 2.0, sz.height / 2.0, str.c_str(), + str.c_str() + str.size()); + }, win->fbSize(), hw_); + } +}; + +int main() { + cv::Ptr plan = new FontRenderingPlan(cv::Size(960,960)); + cv::Ptr window = V4D::make(plan->size(), "Font Rendering"); + window->run(plan); +} + diff --git a/modules/v4d/samples/font_with_gui.cpp b/modules/v4d/samples/font_with_gui.cpp new file mode 100644 index 000000000..6ec5406cc --- /dev/null +++ b/modules/v4d/samples/font_with_gui.cpp @@ -0,0 +1,53 @@ +#include + +using namespace cv; +using namespace cv::v4d; + +class FontWithGuiPlan: public Plan { + enum Names { + SIZE, + COLOR + }; + using Params = ThreadSafeMap; + inline static Params params_; + + //The text + string hw_ = "hello world"; +public: + FontWithGuiPlan(const cv::Size& sz) : Plan(sz) { + params_.set(SIZE, 40.0f); + params_.set(COLOR, cv::Scalar_(1.0f, 0.0f, 0.0f, 1.0f)); + } + + void gui(Ptr window) override { + window->imgui([](Ptr win, ImGuiContext* ctx, Params& params) { + CV_UNUSED(win); + using namespace ImGui; + SetCurrentContext(ctx); + Begin("Settings"); + SliderFloat("Font Size", params.ptr(SIZE), 1.0f, 100.0f); + ColorPicker4("Text Color", params.ptr>(COLOR)->val); + End(); + }, params_); + } + + void infer(Ptr window) override { + //Render the text at the center of the screen using parameters from the GUI. 
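+        //The SIZE and COLOR entries read below are the same ones the imgui callback above writes; ThreadSafeMap presumably serializes these cross-thread accesses.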
+ window->nvg([](const Size& sz, const string& str, Params& params) { + using namespace cv::v4d::nvg; + clear(); + fontSize(params.get(SIZE)); + fontFace("sans-bold"); + fillColor(params.get>(COLOR) * 255.0); + textAlign(NVG_ALIGN_CENTER | NVG_ALIGN_TOP); + text(sz.width / 2.0, sz.height / 2.0, str.c_str(), str.c_str() + str.size()); + }, window->fbSize(), hw_, params_); + } +}; + +int main() { + Ptr plan = new FontWithGuiPlan(cv::Size(960,960)); + Ptr window = V4D::make(plan->size(), "Font Rendering with GUI"); + window->run(plan); +} + diff --git a/modules/v4d/samples/fonts/Roboto-Bold.ttf b/modules/v4d/samples/fonts/Roboto-Bold.ttf new file mode 100644 index 000000000..aaf374d2c Binary files /dev/null and b/modules/v4d/samples/fonts/Roboto-Bold.ttf differ diff --git a/modules/v4d/samples/fonts/Roboto-Light.ttf b/modules/v4d/samples/fonts/Roboto-Light.ttf new file mode 100644 index 000000000..664e1b2f9 Binary files /dev/null and b/modules/v4d/samples/fonts/Roboto-Light.ttf differ diff --git a/modules/v4d/samples/fonts/Roboto-Regular.ttf b/modules/v4d/samples/fonts/Roboto-Regular.ttf new file mode 100644 index 000000000..3e6e2e761 Binary files /dev/null and b/modules/v4d/samples/fonts/Roboto-Regular.ttf differ diff --git a/modules/v4d/samples/fonts/entypo.ttf b/modules/v4d/samples/fonts/entypo.ttf new file mode 100644 index 000000000..fc305d2a9 Binary files /dev/null and b/modules/v4d/samples/fonts/entypo.ttf differ diff --git a/modules/v4d/samples/make_example_html.sh b/modules/v4d/samples/make_example_html.sh new file mode 100755 index 000000000..fc1713e7c --- /dev/null +++ b/modules/v4d/samples/make_example_html.sh @@ -0,0 +1,23 @@ +#!/bin/bash + +set -e + +./example_v4d_capture.sh "Beauty Demo" beauty-demo > example_v4d_beauty-demo.html +./example_v4d_nocapture.sh "Cube Demo" cube-demo > example_v4d_cube-demo.html +./example_v4d_nocapture.sh "Custom Source and Sink" custom_source_and_sink > example_v4d_custom_source_and_sink.html +./example_v4d_nocapture.sh "Display an Image through the FB Context" display_image_fb > example_v4d_display_image_fb.html +./example_v4d_nocapture.sh "Display an Image through the Video-Pipeline" display_image > example_v4d_display_image.html +./example_v4d_nocapture.sh "Display an Image through NanoVG" display_image_nvg > example_v4d_display_image_nvg.html +./example_v4d_nocapture.sh "Font Demo" font-demo > example_v4d_font-demo.html +./example_v4d_nocapture.sh "Font rendering with Form-based GUI" font_with_gui > example_v4d_font_with_gui.html +./example_v4d_nocapture.sh "Font rendering" font_rendering > example_v4d_font_rendering.html +./example_v4d_nocapture.sh "Many Cubes Demo" many_cubes-demo > example_v4d_many_cubes-demo.html +./example_v4d_capture.sh "NanoVG Demo" nanovg-demo > example_v4d_nanovg-demo.html +./example_v4d_capture.sh "Sparse Optical Flow Demo" optflow-demo > example_v4d_optflow-demo.html +./example_v4d_capture.sh "Pedestrian Demo" pedestrian-demo > example_v4d_pedestrian-demo.html +./example_v4d_nocapture.sh "Render OpenGL Blue Screen" render_opengl > example_v4d_render_opengl.html +./example_v4d_capture.sh "Mandelbrot Shader Demo" shader-demo > example_v4d_shader-demo.html +./example_v4d_nocapture.sh "Vector Graphics and Frambuffer access" vector_graphics_and_fb > example_v4d_vector_graphics_and_fb.html +./example_v4d_nocapture.sh "Vector Graphics" vector_graphics > example_v4d_vector_graphics.html +./example_v4d_capture.sh "Video Demo" video-demo > example_v4d_video-demo.html +./example_v4d_capture.sh "Video Editing" video_editing > 
example_v4d_video_editing.html diff --git a/modules/v4d/samples/many_cubes-demo.cpp b/modules/v4d/samples/many_cubes-demo.cpp new file mode 100644 index 000000000..8e1ecb47a --- /dev/null +++ b/modules/v4d/samples/many_cubes-demo.cpp @@ -0,0 +1,269 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include +//adapted from https://gitlab.com/wikibooks-opengl/modern-tutorials/-/blob/master/tut05_cube/cube.cpp + +using namespace cv::v4d; +class ManyCubesDemoPlan : public Plan { +public: + using Plan::Plan; + + /* Demo Parameters */ + constexpr static size_t NUMBER_OF_CUBES_ = 10; + + int glowKernelSize_; + + /* OpenGL constants and variables */ + constexpr static GLuint TRIANGLES_ = 12; + constexpr static GLuint VERTICES_INDEX_ = 0; + constexpr static GLuint COLORS_INDEX_ = 1; + + //Cube vertices, colors and indices + constexpr static float VERTICES_[24] = { + // Front face + 0.5, 0.5, 0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, 0.5, + // Back face + 0.5, 0.5, -0.5, -0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, -0.5, -0.5 + }; + + constexpr static float VERTEX_COLORS_[24] = { + 1.0, 0.4, 0.6, 1.0, 0.9, 0.2, 0.7, 0.3, 0.8, 0.5, 0.3, 1.0, + 0.2, 0.6, 1.0, 0.6, 1.0, 0.4, 0.6, 0.8, 0.8, 0.4, 0.8, 0.8 + }; + + constexpr static unsigned short TRIANGLE_INDICES_[36] = { + // Front + 0, 1, 2, 2, 3, 0, + + // Right + 0, 3, 7, 7, 4, 0, + + // Bottom + 2, 6, 7, 7, 3, 2, + + // Left + 1, 5, 6, 6, 2, 1, + + // Back + 4, 7, 6, 6, 5, 4, + + // Top + 5, 1, 0, 0, 4, 5 + }; +private: + struct Cache { + cv::UMat down_; + cv::UMat up_; + cv::UMat blur_; + cv::UMat dst16_; + } cache_; + GLuint vao_[NUMBER_OF_CUBES_]; + GLuint shaderProgram_[NUMBER_OF_CUBES_]; + GLuint uniformTransform_[NUMBER_OF_CUBES_]; + + //Simple transform & pass-through shaders + static GLuint load_shader() { + //Shader versions "330" and "300 es" are very similar. + //If you are careful you can write the same code for both versions. 
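+    //For instance, GLSL ES requires an explicit default precision while desktop GLSL accepts but ignores precision qualifiers,
+    //which is why both shaders below declare "precision lowp float;" unconditionally.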
+ #if !defined(OPENCV_V4D_USE_ES3) + const string shaderVersion = "330"; + #else + const string shaderVersion = "300 es"; + #endif + + const string vert = + " #version " + shaderVersion + + R"( + precision lowp float; + layout(location = 0) in vec3 pos; + layout(location = 1) in vec3 vertex_color; + + uniform mat4 transform; + + out vec3 color; + void main() { + gl_Position = transform * vec4(pos, 1.0); + color = vertex_color; + } + )"; + + const string frag = + " #version " + shaderVersion + + R"( + precision lowp float; + in vec3 color; + + out vec4 frag_color; + + void main() { + frag_color = vec4(color, 1.0); + } + )"; + + //Initialize the shaders and returns the program + unsigned int handles[3]; + cv::v4d::initShader(handles, vert.c_str(), frag.c_str(), "fragColor"); + return handles[0]; + } + + //Initializes objects, buffers, shaders and uniforms + static void init_scene(const cv::Size& sz, GLuint& vao, GLuint& shaderProgram, GLuint& uniformTransform) { + glEnable (GL_DEPTH_TEST); + + glGenVertexArrays(1, &vao); + glBindVertexArray(vao); + + unsigned int triangles_ebo; + glGenBuffers(1, &triangles_ebo); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, triangles_ebo); + glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof TRIANGLE_INDICES_, TRIANGLE_INDICES_, + GL_STATIC_DRAW); + + unsigned int verticies_vbo; + glGenBuffers(1, &verticies_vbo); + glBindBuffer(GL_ARRAY_BUFFER, verticies_vbo); + glBufferData(GL_ARRAY_BUFFER, sizeof VERTICES_, VERTICES_, GL_STATIC_DRAW); + + glVertexAttribPointer(VERTICES_INDEX_, 3, GL_FLOAT, GL_FALSE, 0, NULL); + glEnableVertexAttribArray(VERTICES_INDEX_); + + unsigned int colors_vbo; + glGenBuffers(1, &colors_vbo); + glBindBuffer(GL_ARRAY_BUFFER, colors_vbo); + glBufferData(GL_ARRAY_BUFFER, sizeof VERTEX_COLORS_, VERTEX_COLORS_, GL_STATIC_DRAW); + + glVertexAttribPointer(COLORS_INDEX_, 3, GL_FLOAT, GL_FALSE, 0, NULL); + glEnableVertexAttribArray(COLORS_INDEX_); + + glBindVertexArray(0); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0); + glBindBuffer(GL_ARRAY_BUFFER, 0); + + shaderProgram = load_shader(); + uniformTransform = glGetUniformLocation(shaderProgram, "transform"); + glViewport(0,0, sz.width, sz.height); + } + + //Renders a rotating rainbow-colored cube on a blueish background + static void render_scene(const cv::Size& sz, const double& x, const double& y, const double& angleMod, GLuint& vao, GLuint& shaderProgram, GLuint& uniformTransform) { + glViewport(0,0, sz.width, sz.height); + //Use the prepared shader program + glUseProgram(shaderProgram); + + //Scale and rotate the cube depending on the current time. 
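+            //getTickCount()/getTickFrequency() yields a monotonically increasing time in seconds; angleMod adds a per-cube phase offset and fmod wraps the result to [0, 2*PI).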
+ float angle = fmod(double(cv::getTickCount()) / double(cv::getTickFrequency()) + angleMod, 2 * M_PI); + double scale = 0.25; + cv::Matx44f scaleMat( + scale, 0.0, 0.0, 0.0, + 0.0, scale, 0.0, 0.0, + 0.0, 0.0, scale, 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotXMat( + 1.0, 0.0, 0.0, 0.0, + 0.0, cos(angle), -sin(angle), 0.0, + 0.0, sin(angle), cos(angle), 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotYMat( + cos(angle), 0.0, sin(angle), 0.0, + 0.0, 1.0, 0.0, 0.0, + -sin(angle), 0.0,cos(angle), 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotZMat( + cos(angle), -sin(angle), 0.0, 0.0, + sin(angle), cos(angle), 0.0, 0.0, + 0.0, 0.0, 1.0, 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f translateMat( + 1.0, 0.0, 0.0, 0.0, + 0.0, 1.0, 0.0, 0.0, + 0.0, 0.0, 1.0, 0.0, + x, y, 0.0, 1.0); + + //calculate the transform + cv::Matx44f transform = scaleMat * rotXMat * rotYMat * rotZMat * translateMat; + //set the corresponding uniform + glUniformMatrix4fv(uniformTransform, 1, GL_FALSE, transform.val); + //Bind our vertex array + glBindVertexArray(vao); + //Draw + glDrawElements(GL_TRIANGLES, TRIANGLES_ * 3, GL_UNSIGNED_SHORT, NULL); + } + + //applies a glow effect to an image + static void glow_effect(const cv::UMat& src, cv::UMat& dst, const int ksize, Cache& cache) { + cv::bitwise_not(src, dst); + + //Resize for some extra performance + cv::resize(dst, cache.down_, cv::Size(), 0.5, 0.5); + //Cheap blur + cv::boxFilter(cache.down_, cache.blur_, -1, cv::Size(ksize, ksize), cv::Point(-1, -1), true, + cv::BORDER_REPLICATE); + //Back to original size + cv::resize(cache.blur_, cache.up_, src.size()); + + //Multiply the src image with a blurred version of itself + cv::multiply(dst, cache.up_, cache.dst16_, 1, CV_16U); + //Normalize and convert back to CV_8U + cv::divide(cache.dst16_, cv::Scalar::all(255.0), dst, 1, CV_8U); + + cv::bitwise_not(dst, dst); + } + +public: + void setup(cv::Ptr window) override { + int diag = hypot(double(size().width), double(size().height)); + glowKernelSize_ = std::max(int(diag / 138 % 2 == 0 ? 
diag / 138 + 1 : diag / 138), 1); + + for(size_t i = 0; i < NUMBER_OF_CUBES_; ++i) { + window->gl(i, [](const size_t& ctxIdx, const cv::Size& sz, GLuint& vao, GLuint& shader, GLuint& uniformTrans){ + CV_UNUSED(ctxIdx); + init_scene(sz, vao, shader, uniformTrans); + }, size(), vao_[i], shaderProgram_[i], uniformTransform_[i]); + } + } + + void infer(cv::Ptr window) override { + window->gl([](){ + //Clear the background + glClearColor(0.2, 0.24, 0.4, 1); + glClear(GL_COLOR_BUFFER_BIT); + }); + + //Render using multiple OpenGL contexts + for(size_t i = 0; i < NUMBER_OF_CUBES_; ++i) { + window->gl(i, [](const int32_t& ctxIdx, const cv::Size& sz, GLuint& vao, GLuint& shader, GLuint& uniformTrans){ + double x = sin((double(ctxIdx) / NUMBER_OF_CUBES_) * 2 * M_PI) / 1.5; + double y = cos((double(ctxIdx) / NUMBER_OF_CUBES_) * 2 * M_PI) / 1.5; + double angle = sin((double(ctxIdx) / NUMBER_OF_CUBES_) * 2 * M_PI); + render_scene(sz, x, y, angle, vao, shader, uniformTrans); + }, size(), vao_[i], shaderProgram_[i], uniformTransform_[i]); + } + + //Aquire the frame buffer for use by OpenCV + window->fb([](cv::UMat& framebuffer, const cv::Rect& viewport, int glowKernelSize, Cache& cache) { + cv::UMat roi = framebuffer(viewport); + glow_effect(roi, roi, glowKernelSize, cache); + }, viewport(), glowKernelSize_, cache_); + + window->write(); + } +}; + +int main() { + cv::Ptr plan = new ManyCubesDemoPlan(cv::Size(1280, 720)); + cv::Ptr window = V4D::make(plan->size(), "Many Cubes Demo", IMGUI); + + //Creates a writer sink (which might be hardware accelerated) + auto sink = makeWriterSink(window, "many_cubes-demo.mkv", 60, plan->size()); + window->setSink(sink); + window->run(plan, 1); + + return 0; +} diff --git a/modules/v4d/samples/montage-demo.cpp b/modules/v4d/samples/montage-demo.cpp new file mode 100644 index 000000000..ddc555871 --- /dev/null +++ b/modules/v4d/samples/montage-demo.cpp @@ -0,0 +1,137 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +int v4d_cube_main(); +int v4d_many_cubes_main(); +int v4d_video_main(int argc, char **argv); +int v4d_nanovg_main(int argc, char **argv); +int v4d_shader_main(int argc, char **argv); +int v4d_font_main(); +int v4d_pedestrian_main(int argc, char **argv); +int v4d_optflow_main(int argc, char **argv); +int v4d_beauty_main(int argc, char **argv); +#define main v4d_cube_main +#include "cube-demo.cpp" +#undef main +#define main v4d_many_cubes_main +#include "many_cubes-demo.cpp" +#undef main +#define main v4d_video_main +#include "video-demo.cpp" +#undef main +#define main v4d_nanovg_main +#include "nanovg-demo.cpp" +#undef main +#define main v4d_shader_main +#include "shader-demo.cpp" +#undef main +#define main v4d_font_main +#include "font-demo.cpp" +#undef main +#define main v4d_pedestrian_main +#include "pedestrian-demo.cpp" +#undef main +#define main v4d_optflow_main +#include "optflow-demo.cpp" +#undef main +#define main v4d_beauty_main +#include "beauty-demo.cpp" +#undef main + +class MontageDemoPlan : public Plan { + const cv::Size tiling_ = cv::Size(3, 3); + const cv::Size tileSz_ = cv::Size(640, 360); + const cv::Rect viewport_ = cv::Rect(0, 720, 640, 360); + + std::vector plans_ = { + new CubeDemoPlan(viewport_), + new ManyCubesDemoPlan(viewport_), + new VideoDemoPlan(viewport_), + new NanoVGDemoPlan(viewport_), + new ShaderDemoPlan(viewport_), + new FontDemoPlan(viewport_), + new PedestrianDemoPlan(viewport_), + new BeautyDemoPlan(viewport_), + new OptflowDemoPlan(viewport_) + }; + struct Frames { + std::vector results_ = std::vector(9); + cv::UMat captured; + } frames_; + + cv::Size_ scale_; +public: + MontageDemoPlan(const cv::Size& sz) : Plan(sz) { + CV_Assert(plans_.size() == frames_.results_.size() && plans_.size() == size_t(tiling_.width * tiling_.height)); + scale_ = cv::Size_(float(size().width) / tileSz_.width, float(size().height) / tileSz_.height); + } + + virtual void setup(cv::Ptr window) override { + for(auto* plan : plans_) { + plan->setup(window); + } + } + + virtual void infer(cv::Ptr window) override { + window->nvgCtx()->setScale(scale_); + window->capture(); + window->setDisableIO(true); + window->fb([](cv::UMat& framebuffer, const cv::Size& tileSize, cv::UMat& captured){ + cv::resize(framebuffer, captured, tileSize); + }, tileSz_, frames_.captured); + + + for(size_t i = 0; i < plans_.size(); ++i) { + auto* plan = plans_[i]; + window->fb([](cv::UMat& framebuffer, const cv::Size& tileSize, const cv::UMat& captured){ + framebuffer = cv::Scalar::all(0); + captured.copyTo(framebuffer(cv::Rect(0, tileSize.height * 2, tileSize.width, tileSize.height))); + }, tileSz_, frames_.captured); + plan->infer(window); + window->fb([](const cv::UMat& framebuffer, cv::UMat& result){ + framebuffer.copyTo(result); + }, frames_.results_[i]); + } + + window->fb([](cv::UMat& framebuffer, const cv::Size& tileSz, const Frames& frames){ + int w = tileSz.width; + int h = tileSz.height; + framebuffer = cv::Scalar::all(0); + + for(size_t x = 0; x < 3; ++x) + for(size_t y = 0; y < 3; ++y) + frames.results_[x * 3 + y](cv::Rect(0, h * 2, w, h)).copyTo(framebuffer(cv::Rect(w * x, h * y, w, h))); + }, tileSz_, frames_); + + window->setDisableIO(false); + window->write(); + } + + virtual void teardown(cv::Ptr window) override { + for(auto* plan : plans_) { + plan->teardown(window); + } + } +}; + +int main(int argc, char** argv) { + if (argc != 3) { + cerr << "Usage: montage-demo " << endl; + exit(1); + } + + cv::Ptr plan = new MontageDemoPlan(cv::Size(1920, 
1080)); + cv::Ptr window = V4D::make(plan->size(), "Montage Demo", ALL); + //Creates a source from a file or a device + auto src = makeCaptureSource(window, argv[1]); + window->setSource(src); + //Creates a writer sink (which might be hardware accelerated) + auto sink = makeWriterSink(window, "montage-demo.mkv", 60, plan->size()); + window->setSink(sink); + window->run(plan, atoi(argv[2])); + + return 0; +} + diff --git a/modules/v4d/samples/nanovg-demo.cpp b/modules/v4d/samples/nanovg-demo.cpp new file mode 100644 index 000000000..8a50b9396 --- /dev/null +++ b/modules/v4d/samples/nanovg-demo.cpp @@ -0,0 +1,195 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include + +static void draw_color_wheel(float x, float y, float w, float h, double hue) { + //color wheel drawing code taken from https://github.com/memononen/nanovg/blob/master/example/demo.c + using namespace cv::v4d::nvg; + int i; + float r0, r1, ax, ay, bx, by, cx, cy, aeps, r; + Paint paint; + + save(); + + cx = x + w * 0.5f; + cy = y + h * 0.5f; + r1 = (w < h ? w : h) * 0.5f - 5.0f; + r0 = r1 - 20.0f; + aeps = 0.5f / r1; // half a pixel arc length in radians (2pi cancels out). + + for (i = 0; i < 6; i++) { + float a0 = (float) i / 6.0f * CV_PI * 2.0f - aeps; + float a1 = (float) (i + 1.0f) / 6.0f * CV_PI * 2.0f + aeps; + beginPath(); + arc(cx, cy, r0, a0, a1, NVG_CW); + arc(cx, cy, r1, a1, a0, NVG_CCW); + closePath(); + ax = cx + cosf(a0) * (r0 + r1) * 0.5f; + ay = cy + sinf(a0) * (r0 + r1) * 0.5f; + bx = cx + cosf(a1) * (r0 + r1) * 0.5f; + by = cy + sinf(a1) * (r0 + r1) * 0.5f; + paint = linearGradient(ax, ay, bx, by, + cv::v4d::colorConvert(cv::Scalar((a0 / (CV_PI * 2.0)) * 180.0, 0.55 * 255.0, 255.0, 255.0), cv::COLOR_HLS2BGR), + cv::v4d::colorConvert(cv::Scalar((a1 / (CV_PI * 2.0)) * 180.0, 0.55 * 255, 255, 255), cv::COLOR_HLS2BGR)); + fillPaint(paint); + fill(); + } + + beginPath(); + circle(cx, cy, r0 - 0.5f); + circle(cx, cy, r1 + 0.5f); + strokeColor(cv::Scalar(0, 0, 0, 64)); + strokeWidth(1.0f); + stroke(); + + // Selector + save(); + translate(cx, cy); + rotate((hue/255.0) * CV_PI * 2); + + // Marker on + strokeWidth(2.0f); + beginPath(); + rect(r0 - 1, -3, r1 - r0 + 2, 6); + strokeColor(cv::Scalar(255, 255, 255, 192)); + stroke(); + + paint = boxGradient(r0 - 3, -5, r1 - r0 + 6, 10, 2, 4, cv::Scalar(0, 0, 0, 128), cv::Scalar(0, 0, 0, 0)); + beginPath(); + rect(r0 - 2 - 10, -4 - 10, r1 - r0 + 4 + 20, 8 + 20); + rect(r0 - 2, -4, r1 - r0 + 4, 8); + pathWinding(NVG_HOLE); + fillPaint(paint); + fill(); + + // Center triangle + r = r0 - 6; + ax = cosf(120.0f / 180.0f * NVG_PI) * r; + ay = sinf(120.0f / 180.0f * NVG_PI) * r; + bx = cosf(-120.0f / 180.0f * NVG_PI) * r; + by = sinf(-120.0f / 180.0f * NVG_PI) * r; + beginPath(); + moveTo(r, 0); + lineTo(ax, ay); + lineTo(bx, by); + closePath(); + paint = linearGradient(r, 0, ax, ay, cv::v4d::colorConvert(cv::Scalar(hue, 128.0, 255.0, 255.0), cv::COLOR_HLS2BGR_FULL), cv::Scalar(255, 255, 255, 255)); + fillPaint(paint); + fill(); + paint = linearGradient((r + ax) * 0.5f, (0 + ay) * 0.5f, bx, by, cv::Scalar(0, 0, 0, 0), cv::Scalar(0, 0, 0, 255)); + fillPaint(paint); + fill(); + strokeColor(cv::Scalar(0, 0, 0, 64)); + stroke(); + + // Select circle on triangle + ax = cosf(120.0f / 180.0f * NVG_PI) * r * 0.3f; + ay = sinf(120.0f / 180.0f * NVG_PI) * r * 0.4f; + strokeWidth(2.0f); + 
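    // Small marker circle on the center triangle; its position is fixed here since the demo has no interactive color selection.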
beginPath(); + circle(ax, ay, 5); + strokeColor(cv::Scalar(255, 255, 255, 192)); + stroke(); + + paint = radialGradient(ax, ay, 7, 9, cv::Scalar(0, 0, 0, 64), cv::Scalar(0, 0, 0, 0)); + beginPath(); + rect(ax - 20, ay - 20, 40, 40); + circle(ax, ay, 7); + pathWinding(NVG_HOLE); + fillPaint(paint); + fill(); + + restore(); + + restore(); +} + +using namespace cv::v4d; + +class NanoVGDemoPlan : public Plan { + std::vector hsvChannels_; + cv::UMat rgb_; + cv::UMat bgra_; + cv::UMat hsv_; + cv::UMat hueChannel_; + inline static long cnt_ = 0; + double hue_ = 0; +public: + using Plan::Plan; + + NanoVGDemoPlan(const cv::Rect& vp) : Plan(vp) { + Global::registerShared(cnt_); + } + + NanoVGDemoPlan(const cv::Size& sz) : NanoVGDemoPlan(cv::Rect(0, 0, sz.width, sz.height)) { + } + + void infer(cv::Ptr window) override { + window->plain([](long& cnt, double& hue){ + long c; + Global::lock(cnt); + //we use frame count to calculate the current hue + double t = ++c / 60.0; + //nanovg hue fading depending on t + hue = (sinf(t * 0.12) + 1.0) * 127.5; + Global::unlock(cnt); + }, cnt_, hue_); + + window->capture(); + + //Acquire the framebuffer and convert it to RGB + window->fb([](const cv::UMat &framebuffer, const cv::Rect& viewport, cv::UMat& rgb) { + cvtColor(framebuffer(viewport), rgb, cv::COLOR_BGRA2RGB); + }, viewport(), rgb_); + + window->plain([](cv::UMat& rgb, cv::UMat& hsv, std::vector& hsvChannels, double& hue){ + //Color-conversion from RGB to HSV + cv::cvtColor(rgb, hsv, cv::COLOR_RGB2HSV_FULL); + + //Split the channels + split(hsv,hsvChannels); + //Set the current hue + hsvChannels[0].setTo(std::round(hue)); + //Merge the channels back + merge(hsvChannels,hsv); + + //Color-conversion from HSV to RGB + cv::cvtColor(hsv, rgb, cv::COLOR_HSV2RGB_FULL); + }, rgb_, hsv_, hsvChannels_, hue_); + + //Acquire the framebuffer and convert the rgb_ into it + window->fb([](cv::UMat &framebuffer, const cv::Rect& viewport, const cv::UMat& rgb) { + cv::cvtColor(rgb, framebuffer(viewport), cv::COLOR_BGR2BGRA); + }, viewport(), rgb_); + + //Render using nanovg + window->nvg([](const cv::Size &sz, const double& h) { + draw_color_wheel(sz.width - (sz.width / 5), sz.height - (sz.width / 5), sz.width / 6, sz.width / 6, h); + }, size(), hue_); + + window->write(); + } +}; + +int main(int argc, char **argv) { + if (argc != 2) { + cerr << "Usage: nanovg-demo " << endl; + exit(1); + } + + cv::Ptr plan = new NanoVGDemoPlan(cv::Size(1280, 960)); + cv::Ptr window = V4D::make(plan->size(), "NanoVG Demo", NANOVG); + window->printSystemInfo(); + + auto src = makeCaptureSource(window, argv[1]); + auto sink = makeWriterSink(window, "nanovg-demo.mkv", src->fps(), plan->size()); + window->setSource(src); + window->setSink(sink); + + window->run(plan); + + return 0; +} diff --git a/modules/v4d/samples/optflow-demo.cpp b/modules/v4d/samples/optflow-demo.cpp new file mode 100644 index 000000000..b7c15fd9a --- /dev/null +++ b/modules/v4d/samples/optflow-demo.cpp @@ -0,0 +1,472 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include + +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +using std::vector; +using std::string; + +using namespace cv::v4d; + +class OptflowDemoPlan : public Plan { +public: + using Plan::Plan; +private: + //How the background will be visualized + enum BackgroundModes { + GREY, + COLOR, + VALUE, + BLACK + }; + + //Post-processing modes for the foreground + enum PostProcModes { + GLOW, + BLOOM, + DISABLED + }; + + static struct Params { + // Generate the foreground at this scale. + float fgScale_ = 0.5f; + // On every frame the foreground loses on brightness. Specifies the loss in percent. + float fgLoss_ = 1; + //Convert the background to greyscale + BackgroundModes backgroundMode_ = GREY; + // Peak thresholds for the scene change detection. Lowering them makes the detection more sensitive but + // the default should be fine. + float sceneChangeThresh_ = 0.29f; + float sceneChangeThreshDiff_ = 0.1f; + // The theoretical maximum number of points to track which is scaled by the density of detected points + // and therefor is usually much smaller. + int maxPoints_ = 300000; + // How many of the tracked points to lose intentionally, in percent. + float pointLoss_ = 20; + // The theoretical maximum size of the drawing stroke which is scaled by the area of the convex hull + // of tracked points and therefor is usually much smaller. + int maxStroke_ = 6; + // Blue, green, red and alpha. All from 0.0f to 1.0f + cv::Scalar_ effectColor_ = {0.4f, 0.75f, 1.0f, 0.15f}; + //display on-screen FPS + bool showFps_ = true; + //Stretch frame buffer to window size + bool stretch_ = false; + //The post processing mode + PostProcModes postProcMode_ = GLOW; + // Intensity of glow or bloom defined by kernel size. The default scales with the image diagonal. 
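+        // The 0 is only a placeholder; setup() initializes it to an odd kernel size derived from the frame diagonal (roughly diag / 150), and the GUI slider can adjust it afterwards.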
+ int glowKernelSize_ = 0; + //The lightness selection threshold + int bloomThresh_ = 210; + //The intensity of the bloom filter + float bloomGain_ = 3; + } params_; + + struct Cache { + cv::Mat element_ = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3), cv::Point(1, 1)); + + vector tmpKeyPoints_; + + float last_movement_ = 0; + + vector hull_, prevPoints_, nextPoints_, newPoints_; + vector upPrevPoints_, upNextPoints_; + std::vector status_; + std::vector err_; + std::random_device rd_; + std::mt19937 rng_; + + cv::UMat bgr_; + cv::UMat hls_; + cv::UMat ls16_; + cv::UMat ls_; + cv::UMat bblur_; + std::vector hlsChannels_; + + cv::UMat high_; + cv::UMat low_; + cv::UMat gblur_; + cv::UMat dst16_; + + cv::UMat tmp_; + cv::UMat post_; + cv::UMat backgroundGrey_; + vector channels_; + cv::UMat localFg_; + } cache_; + + //BGRA + cv::UMat background_, down_, frame_; + inline static cv::UMat foreground_; + //BGR + cv::UMat result_; + //GREY + cv::UMat downPrevGrey_, downNextGrey_, downMotionMaskGrey_; + vector detectedPoints_; + + cv::Ptr bg_subtractor_ = cv::createBackgroundSubtractorMOG2(100, 16.0, false); + cv::Ptr detector_ = cv::FastFeatureDetector::create(1, false); + + //Uses background subtraction to generate a "motion mask" + static void prepare_motion_mask(const cv::UMat& srcGrey, cv::UMat& motionMaskGrey, cv::Ptr bg_subtractor, Cache& cache) { + bg_subtractor->apply(srcGrey, motionMaskGrey); + //Surpress speckles + cv::morphologyEx(motionMaskGrey, motionMaskGrey, cv::MORPH_OPEN, cache.element_, cv::Point(cache.element_.cols >> 1, cache.element_.rows >> 1), 2, cv::BORDER_CONSTANT, cv::morphologyDefaultBorderValue()); + } + + //Detect points to track + static void detect_points(const cv::UMat& srcMotionMaskGrey, vector& points, cv::Ptr detector, Cache& cache) { + detector->detect(srcMotionMaskGrey, cache.tmpKeyPoints_); + + points.clear(); + for (const auto &kp : cache.tmpKeyPoints_) { + points.push_back(kp.pt); + } + } + + //Detect extrem changes in scene content and report it + static bool detect_scene_change(const cv::UMat& srcMotionMaskGrey, const Params& params, Cache& cache) { + float movement = cv::countNonZero(srcMotionMaskGrey) / float(srcMotionMaskGrey.cols * srcMotionMaskGrey.rows); + float relation = movement > 0 && cache.last_movement_ > 0 ? std::max(movement, cache.last_movement_) / std::min(movement, cache.last_movement_) : 0; + float relM = relation * log10(1.0f + (movement * 9.0)); + float relLM = relation * log10(1.0f + (cache.last_movement_ * 9.0)); + + bool result = !((movement > 0 && cache.last_movement_ > 0 && relation > 0) + && (relM < params.sceneChangeThresh_ && relLM < params.sceneChangeThresh_ && fabs(relM - relLM) < params.sceneChangeThreshDiff_)); + cache.last_movement_ = (cache.last_movement_ + movement) / 2.0f; + return result; + } + + //Visualize the sparse optical flow + static void visualize_sparse_optical_flow(const cv::UMat &prevGrey, const cv::UMat &nextGrey, const vector &detectedPoints, const Params& params, Cache& cache) { + //less then 5 points is a degenerate case (e.g. 
the corners of a video frame) + if (detectedPoints.size() > 4) { + cv::convexHull(detectedPoints, cache.hull_); + float area = cv::contourArea(cache.hull_); + //make sure the area of the point cloud is positive + if (area > 0) { + float density = (detectedPoints.size() / area); + //stroke size is biased by the area of the point cloud + float strokeSize = params.maxStroke_ * pow(area / (nextGrey.cols * nextGrey.rows), 0.33f); + //max points is biased by the densitiy of the point cloud + size_t currentMaxPoints = ceil(density * params.maxPoints_); + + //lose a number of random points specified by pointLossPercent + std::shuffle(cache.prevPoints_.begin(), cache.prevPoints_.end(), cache.rng_); + cache.prevPoints_.resize(ceil(cache.prevPoints_.size() * (1.0f - (params.pointLoss_ / 100.0f)))); + + //calculate how many newly detected points to add + size_t copyn = std::min(detectedPoints.size(), (size_t(std::ceil(currentMaxPoints)) - cache.prevPoints_.size())); + if (cache.prevPoints_.size() < currentMaxPoints) { + std::copy(detectedPoints.begin(), detectedPoints.begin() + copyn, std::back_inserter(cache.prevPoints_)); + } + + //calculate the sparse optical flow + cv::calcOpticalFlowPyrLK(prevGrey, nextGrey, cache.prevPoints_, cache.nextPoints_, cache.status_, cache.err_); + cache.newPoints_.clear(); + if (cache.prevPoints_.size() > 1 && cache.nextPoints_.size() > 1) { + //scale the points to original size + cache.upNextPoints_.clear(); + cache.upPrevPoints_.clear(); + for (cv::Point2f pt : cache.prevPoints_) { + cache.upPrevPoints_.push_back(pt /= params.fgScale_); + } + + for (cv::Point2f pt : cache.nextPoints_) { + cache.upNextPoints_.push_back(pt /= params.fgScale_); + } + + using namespace cv::v4d::nvg; + //start drawing + beginPath(); + strokeWidth(strokeSize); + strokeColor(params.effectColor_ * 255.0); + + for (size_t i = 0; i < cache.prevPoints_.size(); i++) { + if (cache.status_[i] == 1 //point was found in prev and new set + && cache.err_[i] < (1.0 / density) //with a higher density be more sensitive to the feature error + && cache.upNextPoints_[i].y >= 0 && cache.upNextPoints_[i].x >= 0 //check bounds + && cache.upNextPoints_[i].y < nextGrey.rows / params.fgScale_ && cache.upNextPoints_[i].x < nextGrey.cols / params.fgScale_ //check bounds + ) { + float len = hypot(fabs(cache.upPrevPoints_[i].x - cache.upNextPoints_[i].x), fabs(cache.upPrevPoints_[i].y - cache.upNextPoints_[i].y)); + //upper and lower bound of the flow vector length + if (len > 0 && len < sqrt(area)) { + //collect new points + cache.newPoints_.push_back(cache.nextPoints_[i]); + //the actual drawing operations + moveTo(cache.upNextPoints_[i].x, cache.upNextPoints_[i].y); + lineTo(cache.upPrevPoints_[i].x, cache.upPrevPoints_[i].y); + } + } + } + //end drawing + stroke(); + } + cache.prevPoints_ = cache.newPoints_; + } + } + } + + //Bloom post-processing effect + static void bloom(const cv::UMat& src, cv::UMat &dst, Cache& cache, int ksize = 3, int threshValue = 235, float gain = 4) { + //remove alpha channel + cv::cvtColor(src, cache.bgr_, cv::COLOR_BGRA2RGB); + //convert to hls + cv::cvtColor(cache.bgr_, cache.hls_, cv::COLOR_BGR2HLS); + //split channels + cv::split(cache.hls_, cache.hlsChannels_); + //invert lightness + cv::bitwise_not(cache.hlsChannels_[2], cache.hlsChannels_[2]); + //multiply lightness and saturation + cv::multiply(cache.hlsChannels_[1], cache.hlsChannels_[2], cache.ls16_, 1, CV_16U); + //normalize + cv::divide(cache.ls16_, cv::Scalar(255.0), cache.ls_, 1, CV_8U); + //binary threhold according to 
threshValue + cv::threshold(cache.ls_, cache.bblur_, threshValue, 255, cv::THRESH_BINARY); + //blur + cv::boxFilter(cache.bblur_, cache.bblur_, -1, cv::Size(ksize, ksize), cv::Point(-1,-1), true, cv::BORDER_REPLICATE); + //convert to BGRA + cv::cvtColor(cache.bblur_, cache.bblur_, cv::COLOR_GRAY2BGRA); + //add src and the blurred L-S-product according to gain + addWeighted(src, 1.0, cache.bblur_, gain, 0, dst); + } + + //Glow post-processing effect + static void glow_effect(const cv::UMat &src, cv::UMat &dst, const int ksize, Cache& cache) { + cv::bitwise_not(src, dst); + + //Resize for some extra performance + cv::resize(dst, cache.low_, cv::Size(), 0.5, 0.5); + //Cheap blur + cv::boxFilter(cache.low_, cache.gblur_, -1, cv::Size(ksize, ksize), cv::Point(-1,-1), true, cv::BORDER_REPLICATE); + //Back to original size + cv::resize(cache.gblur_, cache.high_, src.size()); + + //Multiply the src image with a blurred version of itself + cv::multiply(dst, cache.high_, cache.dst16_, 1, CV_16U); + //Normalize and convert back to CV_8U + cv::divide(cache.dst16_, cv::Scalar::all(255.0), dst, 1, CV_8U); + + cv::bitwise_not(dst, dst); + } + + //Compose the different layers into the final image + static void composite_layers(cv::UMat& background, cv::UMat& foreground, const cv::UMat& frameBuffer, cv::UMat& dst, const Params& params, Cache& cache) { + //Lose a bit of foreground brightness based on fgLossPercent + cv::subtract(foreground, cv::Scalar::all(255.0f * (params.fgLoss_ / 100.0f)), foreground); + //Add foreground an the current framebuffer into foregound + cv::add(foreground, frameBuffer, foreground); + + //Dependin on bgMode prepare the background in different ways + switch (params.backgroundMode_) { + case GREY: + cv::cvtColor(background, cache.backgroundGrey_, cv::COLOR_BGRA2GRAY); + cv::cvtColor(cache.backgroundGrey_, background, cv::COLOR_GRAY2BGRA); + break; + case VALUE: + cv::cvtColor(background, cache.tmp_, cv::COLOR_BGRA2BGR); + cv::cvtColor(cache.tmp_, cache.tmp_, cv::COLOR_BGR2HSV); + split(cache.tmp_, cache.channels_); + cv::cvtColor(cache.channels_[2], background, cv::COLOR_GRAY2BGRA); + break; + case COLOR: + break; + case BLACK: + background = cv::Scalar::all(0); + break; + default: + break; + } + + //Depending on ppMode perform post-processing + switch (params.postProcMode_) { + case GLOW: + glow_effect(foreground, cache.post_, params.glowKernelSize_, cache); + break; + case BLOOM: + bloom(foreground, cache.post_, cache, params.glowKernelSize_, params.bloomThresh_, params.bloomGain_); + break; + case DISABLED: + foreground.copyTo(cache.post_); + break; + default: + break; + } + + //Add background and post-processed foreground into dst + cv::add(background, cache.post_, dst); + } +public: + OptflowDemoPlan(const cv::Rect& viewport) : Plan(viewport) { + Global::registerShared(params_); + Global::registerShared(foreground_); + } + + OptflowDemoPlan(const cv::Size& sz) : OptflowDemoPlan(cv::Rect(0,0, sz.width, sz.height)) { + } + + virtual void gui(cv::Ptr window) override { + window->imgui([](cv::Ptr win, ImGuiContext* ctx, Params& params){ + using namespace ImGui; + SetCurrentContext(ctx); + + Begin("Effects"); + Text("Foreground"); + SliderFloat("Scale", ¶ms.fgScale_, 0.1f, 4.0f); + SliderFloat("Loss", ¶ms.fgLoss_, 0.1f, 99.9f); + Text("Background"); + thread_local const char* bgm_items[4] = {"Grey", "Color", "Value", "Black"}; + thread_local int* bgm = (int*)¶ms.backgroundMode_; + ListBox("Mode", bgm, bgm_items, 4, 4); + Text("Points"); + SliderInt("Max. 
Points", ¶ms.maxPoints_, 10, 1000000); + SliderFloat("Point Loss", ¶ms.pointLoss_, 0.0f, 100.0f); + Text("Optical flow"); + SliderInt("Max. Stroke Size", ¶ms.maxStroke_, 1, 100); + ColorPicker4("Color", params.effectColor_.val); + End(); + + Begin("Post Processing"); + thread_local const char* ppm_items[3] = {"Glow", "Bloom", "None"}; + thread_local int* ppm = (int*)¶ms.postProcMode_; + ListBox("Effect",ppm, ppm_items, 3, 3); + SliderInt("Kernel Size",¶ms.glowKernelSize_, 1, 63); + SliderFloat("Gain", ¶ms.bloomGain_, 0.1f, 20.0f); + End(); + + Begin("Settings"); + Text("Scene Change Detection"); + SliderFloat("Threshold", ¶ms.sceneChangeThresh_, 0.1f, 1.0f); + SliderFloat("Threshold Diff", ¶ms.sceneChangeThreshDiff_, 0.1f, 1.0f); + End(); + + Begin("Window"); + if(Checkbox("Show FPS", ¶ms.showFps_)) { + win->setShowFPS(params.showFps_); + } + if(Checkbox("Stretch", ¶ms.stretch_)) { + win->setStretching(params.stretch_); + } + + if(Button("Fullscreen")) { + win->setFullscreen(!win->isFullscreen()); + }; + + if(Button("Offscreen")) { + win->setVisible(!win->isVisible()); + }; + + End(); + }, params_); + } + + virtual void setup(cv::Ptr window) override { + cache_.rng_ = std::mt19937(cache_.rd_()); + window->setStretching(params_.stretch_); + window->once([](const cv::Size& sz, Params& params, cv::UMat& foreground){ + int diag = hypot(double(sz.width), double(sz.height)); + params.glowKernelSize_ = std::max(int(diag / 150 % 2 == 0 ? diag / 150 + 1 : diag / 150), 1); + params.effectColor_[3] /= (Global::workers_started() - 1); + foreground.create(sz, CV_8UC4); + foreground = cv::Scalar::all(0); + }, size(), params_, foreground_); + } + + virtual void infer(cv::Ptr window) override { + window->capture(); + + window->fb([](const cv::UMat& framebuffer, const cv::Rect& viewport, cv::UMat& frame) { + framebuffer(viewport).copyTo(frame); + }, viewport(), frame_); + + window->plain([](const cv::UMat& frame, cv::UMat& background) { + frame.copyTo(background); + }, frame_, background_); + + window->fb([](const cv::UMat& framebuffer, const cv::Rect& viewport, cv::UMat& d, cv::UMat& b, const Params& params) { + Params p = Global::safe_copy(params); + //resize to foreground scale + cv::resize(framebuffer(viewport), d, cv::Size(viewport.width * p.fgScale_, viewport.height * p.fgScale_)); + //save video background + framebuffer(viewport).copyTo(b); + }, viewport(), down_, background_, params_); + + window->plain([](const cv::UMat& d, cv::UMat& dng, cv::UMat& dmmg, std::vector& dp, cv::Ptr& bg_subtractor, cv::Ptr& detector, Cache& cache){ + cv::cvtColor(d, dng, cv::COLOR_RGBA2GRAY); + //Subtract the background to create a motion mask + prepare_motion_mask(dng, dmmg, bg_subtractor, cache); + //Detect trackable points in the motion mask + detect_points(dmmg, dp, detector, cache); + }, down_, downNextGrey_, downMotionMaskGrey_, detectedPoints_, bg_subtractor_, detector_, cache_); + + window->nvg([](const cv::UMat& dmmg, const cv::UMat& dpg, const cv::UMat& dng, const std::vector& dp, const Params& params, Cache& cache) { + const Params p = Global::safe_copy(params); + cv::v4d::nvg::clear(); + if (!dpg.empty()) { + //We don't want the algorithm to get out of hand when there is a scene change, so we suppress it when we detect one. 
+ if (!detect_scene_change(dmmg, p, cache)) { + //Visualize the sparse optical flow using nanovg + visualize_sparse_optical_flow(dpg, dng, dp, p, cache); + } + } + }, downMotionMaskGrey_, downPrevGrey_, downNextGrey_, detectedPoints_, params_, cache_); + + window->plain([](cv::UMat& dpg, const cv::UMat& dng) { + dpg = dng.clone(); + }, downPrevGrey_, downNextGrey_); + + window->fb([](const cv::UMat& framebuffer, const cv::Rect& viewport, cv::UMat& frame) { + framebuffer(viewport).copyTo(frame); + }, viewport(), frame_); + + window->plain([](cv::UMat& frame, cv::UMat& background, cv::UMat& foreground, const Params& params, Cache& cache) { + //Put it all together (OpenCL) + Global::Scope scope(foreground); + copy_shared(foreground, cache.localFg_); + composite_layers(background, cache.localFg_, frame, frame, params, cache); + copy_shared(cache.localFg_, foreground); + }, frame_, background_, foreground_, params_, cache_); + + window->fb([](cv::UMat& framebuffer, const cv::Rect& viewport, const cv::UMat& frame) { + frame.copyTo(framebuffer(viewport)); + }, viewport(), frame_); + + window->write(); + } +}; + +OptflowDemoPlan::Params OptflowDemoPlan::params_; + +int main(int argc, char **argv) { + if (argc != 2) { + std::cerr << "Usage: optflow-demo " << endl; + exit(1); + } + + cv::Ptr plan = new OptflowDemoPlan(cv::Size(1280, 720)); + cv::Ptr window = V4D::make(plan->size(), "Sparse Optical Flow Demo", ALL); + + auto src = makeCaptureSource(window, argv[1]); + auto sink = makeWriterSink(window, "optflow-demo.mkv", src->fps(), plan->size()); + window->setSource(src); + window->setSink(sink); + + window->run(plan, 5); + return 0; +} diff --git a/modules/v4d/samples/pedestrian-demo.cpp b/modules/v4d/samples/pedestrian-demo.cpp new file mode 100644 index 000000000..28ddc9377 --- /dev/null +++ b/modules/v4d/samples/pedestrian-demo.cpp @@ -0,0 +1,292 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include +#include +#include + +#include + +using std::vector; +using std::string; + +using namespace cv::v4d; + +class PedestrianDemoPlan : public Plan { +public: + using Plan::Plan; +private: + unsigned long diag_ = 0; + cv::Size downSize_; + cv::Size_ scale_; + int blurKernelSize_ = 0; + + struct Cache { + cv::UMat blur_; + cv::UMat local_; + uint64_t fps_; + } cache_; + //BGRA + cv::UMat background_; + //RGB + cv::UMat videoFrame_, videoFrameDown_; + //GREY + cv::UMat videoFrameDownGrey_; + + struct Detection { + //detected pedestrian locations rectangles + std::vector locations_; + //detected pedestrian locations as boxes + vector> boxes_; + //probability of detected object being a pedestrian - currently always set to 1.0 + vector probs_; + //Faster tracking parameters + cv::TrackerKCF::Params params_; + //KCF tracker used instead of continous detection + cv::Ptr tracker_; + bool trackerInitialized_ = false; + //If tracking fails re-detect + bool redetect_ = true; + //Descriptor used for pedestrian detection + cv::HOGDescriptor hog_; + } detection_; + + inline static cv::Rect tracked_ = cv::Rect(0,0,1,1); + + constexpr static auto doRedect_ = [](const Detection& detection){ return !detection.trackerInitialized_ || detection.redetect_; }; + constexpr static auto dontRedect_ = [](const Detection& detection){ return detection.trackerInitialized_ && !detection.redetect_; }; + + //adapted from cv::dnn_objdetect::InferBbox + static inline bool pair_comparator(std::pair l1, std::pair l2) { + return l1.first > l2.first; + } + + //adapted from cv::dnn_objdetect::InferBbox + static void intersection_over_union(std::vector > *boxes, std::vector *base_box, std::vector *iou) { + double g_xmin = (*base_box)[0]; + double g_ymin = (*base_box)[1]; + double g_xmax = (*base_box)[2]; + double g_ymax = (*base_box)[3]; + double base_box_w = g_xmax - g_xmin; + double base_box_h = g_ymax - g_ymin; + for (size_t b = 0; b < (*boxes).size(); ++b) { + double xmin = std::max((*boxes)[b][0], g_xmin); + double ymin = std::max((*boxes)[b][1], g_ymin); + double xmax = std::min((*boxes)[b][2], g_xmax); + double ymax = std::min((*boxes)[b][3], g_ymax); + + // Intersection + double w = std::max(static_cast(0.0), xmax - xmin); + double h = std::max(static_cast(0.0), ymax - ymin); + // Union + double test_box_w = (*boxes)[b][2] - (*boxes)[b][0]; + double test_box_h = (*boxes)[b][3] - (*boxes)[b][1]; + + double inter_ = w * h; + double union_ = test_box_h * test_box_w + base_box_h * base_box_w - inter_; + (*iou)[b] = inter_ / (union_ + 1e-7); + } + } + + //adapted from cv::dnn_objdetect::InferBbox + static std::vector non_maximal_suppression(std::vector > *boxes, std::vector *probs, const double threshold = 0.1) { + std::vector keep(((*probs).size())); + std::fill(keep.begin(), keep.end(), true); + std::vector prob_args_sorted((*probs).size()); + + std::vector > temp_sort((*probs).size()); + for (size_t tidx = 0; tidx < (*probs).size(); ++tidx) { + temp_sort[tidx] = std::make_pair((*probs)[tidx], static_cast(tidx)); + } + std::sort(temp_sort.begin(), temp_sort.end(), pair_comparator); + + for (size_t idx = 0; idx < temp_sort.size(); ++idx) { + prob_args_sorted[idx] = temp_sort[idx].second; + } + + for (std::vector::iterator itr = prob_args_sorted.begin(); itr != prob_args_sorted.end() - 1; ++itr) { + size_t idx = itr - prob_args_sorted.begin(); + std::vector iou_(prob_args_sorted.size() - idx - 1); + std::vector > temp_boxes(iou_.size()); + for (size_t bb = 0; bb < 
temp_boxes.size(); ++bb) { + std::vector temp_box(4); + for (size_t b = 0; b < 4; ++b) { + temp_box[b] = (*boxes)[prob_args_sorted[idx + bb + 1]][b]; + } + temp_boxes[bb] = temp_box; + } + intersection_over_union(&temp_boxes, &(*boxes)[prob_args_sorted[idx]], &iou_); + for (std::vector::iterator _itr = iou_.begin(); _itr != iou_.end(); ++_itr) { + size_t iou_idx = _itr - iou_.begin(); + if (*_itr > threshold) { + keep[prob_args_sorted[idx + iou_idx + 1]] = false; + } + } + } + return keep; + } + //post process and add layers together + static void composite_layers(const cv::UMat background, const cv::UMat foreground, cv::UMat dst, int blurKernelSize, Cache& cache) { + cv::boxFilter(foreground, cache.blur_, -1, cv::Size(blurKernelSize, blurKernelSize), cv::Point(-1,-1), true, cv::BORDER_REPLICATE); + cv::add(background, cache.blur_, dst); + } +public: + PedestrianDemoPlan(const cv::Rect& viewport) : Plan(viewport) { + Global::registerShared(tracked_); + } + + PedestrianDemoPlan(const cv::Size& sz) : PedestrianDemoPlan(cv::Rect(0,0,sz.width, sz.height)) { + } + + void setup(cv::Ptr window) override { + int w = size().width; + int h = size().height; + diag_ = hypot(w, h); + downSize_ = { 640 , 360 }; + scale_ = { float(w) / downSize_.width, float(h) / downSize_.height }; + blurKernelSize_ = std::max(int(diag_ / 200 % 2 == 0 ? diag_ / 200 + 1 : diag_ / 200), 1); + + window->plain([](Detection& detection){ + detection.params_.desc_pca = cv::TrackerKCF::GRAY; + detection.params_.compress_feature = false; + detection.params_.compressed_size = 1; + detection.tracker_ = cv::TrackerKCF::create(detection.params_); + detection.hog_.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector()); + }, detection_); + } + + void infer(cv::Ptr window) override { + window->branch(always_); + { + window->capture(); + + window->fb([](const cv::UMat& frameBuffer, const cv::Rect& viewport, cv::UMat& videoFrame){ + //copy video frame + cvtColor(frameBuffer(viewport),videoFrame,cv::COLOR_BGRA2RGB); + //downsample video frame for hog_ detection + }, viewport(), videoFrame_); + + window->plain([](const cv::Size downSize, const cv::UMat& videoFrame, cv::UMat& videoFrameDown, cv::UMat& videoFrameDownGrey, cv::UMat& background){ + cv::resize(videoFrame, videoFrameDown, downSize); + cv::cvtColor(videoFrameDown, videoFrameDownGrey, cv::COLOR_RGB2GRAY); + cv::cvtColor(videoFrame, background, cv::COLOR_RGB2BGRA); + }, downSize_, videoFrame_, videoFrameDown_, videoFrameDownGrey_, background_); + } + window->endbranch(always_); + + //Try to track the pedestrian (if we currently are tracking one), else re-detect using HOG descriptor + window->branch(doRedect_, detection_); + { + window->plain([](cv::UMat& videoFrameDownGrey, Detection& detection, cv::Rect& tracked, Cache& cache){ + detection.redetect_ = false; + + //Detect pedestrians + detection.hog_.detectMultiScale(videoFrameDownGrey, detection.locations_, 0, cv::Size(), cv::Size(), 1.15, 2.0, true); + if (!detection.locations_.empty()) { + detection.boxes_.clear(); + detection.probs_.clear(); + //collect all found boxes + for (const auto &rect : detection.locations_) { + detection.boxes_.push_back( { double(rect.x), double(rect.y), double(rect.x + rect.width), double(rect.y + rect.height) }); + detection.probs_.push_back(1.0); + } + + //use nms to filter overlapping boxes (https://medium.com/analytics-vidhya/non-max-suppression-nms-6623e6572536) + vector keep = non_maximal_suppression(&detection.boxes_, &detection.probs_, 0.1); + for (size_t i = 0; i < keep.size(); ++i) { 
+ if (keep[i]) { + Global::Scope scope(tracked); + //only track the first pedestrian found + tracked = detection.locations_[i]; + break; + } + } + + if(!detection.trackerInitialized_) { + Global::Scope scope(tracked); + //initialize the tracker once + detection.tracker_->init(videoFrameDownGrey, tracked); + detection.trackerInitialized_ = true; + } + } + }, videoFrameDownGrey_, detection_, tracked_, cache_); + } + window->endbranch(doRedect_, detection_); + + window->branch(dontRedect_, detection_); + { + window->plain([](cv::UMat& videoFrameDownGrey, Detection& detection, const uint64_t& frameCnt, cv::Rect& tracked, Cache& cache){ + Global::Scope scope(tracked); + cv::Rect oldTracked = tracked; + if((cache.fps_ == 0 || frameCnt % cache.fps_ == 0) || !detection.tracker_->update(videoFrameDownGrey, tracked)) { + cache.fps_ = uint64_t(std::ceil(Global::fps())); + //detection failed - re-detect + detection.redetect_ = true; + } + tracked.x = (oldTracked.x + tracked.x) / 2.0; + tracked.y = (oldTracked.y + tracked.y) / 2.0; + tracked.width = (oldTracked.width + tracked.width) / 2.0; + tracked.height = (oldTracked.height+ tracked.height) / 2.0; + }, videoFrameDownGrey_, detection_, window->frameCount(), tracked_, cache_); + } + window->endbranch(dontRedect_, detection_); + + window->branch(always_); + { + //Draw an ellipse around the tracked pedestrian + window->nvg([](const cv::Size& sz, const cv::Size_ scale, cv::Rect& tracked) { + using namespace cv::v4d::nvg; + float width; + float height; + float cx; + float cy; + { + Global::Scope scope(tracked); + width = tracked.width * scale.width; + height = tracked.height * scale.height; + cx = (scale.width * tracked.x + (width / 2.0)); + cy = (scale.height * tracked.y + ((height) / 2.0)); + } + + clear(); + beginPath(); + strokeWidth(std::fmax(5, sz.width / 960.0)); + strokeColor(cv::v4d::colorConvert(cv::Scalar(0, 127, 255, 200), cv::COLOR_HLS2BGR)); + ellipse(cx, cy, (width), (height)); + stroke(); + }, size(), scale_, tracked_); + + //Put it all together + window->fb([](cv::UMat& frameBuffer, const cv::Rect& viewport, cv::UMat& bg, int blurKernelSize, Cache& cache){ + composite_layers(bg, frameBuffer(viewport), frameBuffer(viewport), blurKernelSize, cache); + }, viewport(), background_, blurKernelSize_, cache_); + + window->write(); + } + window->endbranch(always_); + } +}; + + +int main(int argc, char **argv) { + if (argc != 2) { + std::cerr << "Usage: pedestrian-demo " << endl; + exit(1); + } + + cv::Ptr plan = new PedestrianDemoPlan(cv::Size(1280, 720)); + cv::Ptr window = V4D::make(plan->size(), "Pedestrian Demo", ALL); + + window->printSystemInfo(); + + auto src = makeCaptureSource(window, argv[1]); + auto sink = makeWriterSink(window, "pedestrian-demo.mkv", src->fps(), plan->size()); + window->setSource(src); + window->setSink(sink); + + window->run(plan); + + return 0; +} diff --git a/modules/v4d/samples/render_opengl.cpp b/modules/v4d/samples/render_opengl.cpp new file mode 100644 index 000000000..749905d1d --- /dev/null +++ b/modules/v4d/samples/render_opengl.cpp @@ -0,0 +1,30 @@ +#include + +using namespace cv; +using namespace cv::v4d; + +class RenderOpenGLPlan : public Plan { +public: + RenderOpenGLPlan(const cv::Size& sz) : Plan(sz) { + } + + void setup(Ptr window) override { + window->gl([]() { + //Sets the clear color to blue + glClearColor(0.0f, 0.0f, 1.0f, 1.0f); + }); + } + void infer(Ptr window) override { + window->gl([]() { + //Clears the screen. The clear color and other GL-states are preserved between context-calls. 
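+			//Only the clear itself has to run every frame; the blue clear color set once in setup() carries over to each call.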
+ glClear(GL_COLOR_BUFFER_BIT); + }); + } +}; + +int main() { + Ptr plan = new RenderOpenGLPlan(cv::Size(960, 960)); + Ptr window = V4D::make(plan->size(), "GL Blue Screen"); + window->run(plan); +} + diff --git a/modules/v4d/samples/scene-demo.cpp b/modules/v4d/samples/scene-demo.cpp new file mode 100644 index 000000000..33d26f727 --- /dev/null +++ b/modules/v4d/samples/scene-demo.cpp @@ -0,0 +1,67 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include +#include + +using namespace cv::v4d; + +class SceneDemoPlan : public Plan { + const string filename_ = "gear.glb"; + gl::Scene scene_; + gl::Scene pcaScene_; + std::vector pointCloud_; + + struct Transform { + cv::Vec3f translate_; + cv::Vec3f rotation_; + cv::Vec3f scale_; + cv::Matx44f projection_; + cv::Matx44f view_; + cv::Matx44f model_; + } transform_; +public: + using Plan::Plan; + + void setup(cv::Ptr window) override { + window->gl([](gl::Scene& scene, const string& filename){ + CV_Assert(scene.load(filename)); + }, scene_, filename_); + } + + void infer(cv::Ptr window) override { + window->gl(0,[](const int32_t& ctx, const cv::Rect& viewport, gl::Scene& scene, std::vector& pointCloud, Transform& transform){ + glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); + double progress = (cv::getTickCount() / cv::getTickFrequency()) / 5.0; + float angle = fmod(double(cv::getTickCount()) / double(cv::getTickFrequency()), 2 * M_PI); + int m = int(progress) % 3; + + float scale = scene.autoScale(); + cv::Vec3f center = scene.autoCenter(); + transform.rotation_ = {0, angle, 0}; + transform.translate_ = {-center[0], -center[1], -center[2]}; + transform.scale_ = { scale, scale, scale }; + transform.projection_ = gl::perspective(45.0f * (CV_PI/180), float(viewport.width) / viewport.height, 0.1f, 100.0f); + transform.view_ = gl::lookAt(cv::Vec3f(0.0f, 0.0f, 3.0f), cv::Vec3f(0.0f, 0.0f, 0.0f), cv::Vec3f(0.0f, 1.0f, 0.0f)); + transform.model_ = gl::modelView(transform.translate_, transform.rotation_, transform.scale_); + + scene.setMode(static_cast(m)); + scene.render(viewport, transform.projection_, transform.view_, transform.model_); + }, viewport(), scene_, pointCloud_, transform_); + window->write(); + } +}; + +int main() { + cv::Ptr window = V4D::make(cv::Size(1280, 720), "Scene Demo", IMGUI); + cv::Ptr plan = new SceneDemoPlan(cv::Size(1280, 720)); + + //Creates a writer sink (which might be hardware accelerated) + auto sink = makeWriterSink(window, "scene-demo.mkv", 60, plan->size()); + window->setSink(sink); + window->run(plan, 3); + + return 0; +} diff --git a/modules/v4d/samples/shader-demo.cpp b/modules/v4d/samples/shader-demo.cpp new file mode 100644 index 000000000..5bc07c467 --- /dev/null +++ b/modules/v4d/samples/shader-demo.cpp @@ -0,0 +1,348 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include + +using namespace cv::v4d; + +class ShaderDemoPlan : public Plan { +public: + using Plan::Plan; + + //A value greater 1 will enable experimental tiling with one context per tile. 
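+	//For example, TILING_ = 2 would (experimentally) split the framebuffer into a 2x2 grid of viewports, each rendered by its own GL context (NUM_CONTEXTS_ = 4).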
+ constexpr static size_t TILING_ = 1; + constexpr static size_t NUM_CONTEXTS_ = TILING_ * TILING_; +private: + // vertex position, color + constexpr static float vertices[12] = { + // x y z + -1.0f, -1.0f, -0.0f, 1.0f, 1.0f, -0.0f, -1.0f, 1.0f, -0.0f, 1.0f, -1.0f, -0.0f }; + + constexpr static unsigned int indices[6] = { + // 2---,1 + // | .' | + // 0'---3 + 0, 1, 2, 0, 3, 1 }; + + static struct Params { + /* Mandelbrot control parameters */ + // Red, green, blue and alpha. All from 0.0f to 1.0f + float baseColorVal_[4] = {0.2, 0.6, 1.0, 0.8}; + //contrast boost + int contrastBoost_ = 255; //0.0-255 + //max fractal iterations + int maxIterations_ = 50000; + //center x coordinate + float centerX_ = -0.466; + //center y coordinate + float centerY_ = 0.57052; + float zoomFactor_ = 1.0; + float currentZoom_ = 4.0; + bool zoomIn = true; + float zoomIncr_ = -currentZoom_ / 1000; + bool manualNavigation_ = false; + } params_; + + struct Handles { + /* GL uniform handles */ + GLint baseColorHdl_; + GLint contrastBoostHdl_; + GLint maxIterationsHdl_; + GLint centerXHdl_; + GLint centerYHdl_; + GLint offsetXHdl_; + GLint offsetYHdl_; + GLint currentZoomHdl_; + GLint resolutionHdl_; + + /* Shader program handle */ + GLuint shaderHdl_; + + /* Object handles */ + GLuint vao_; + GLuint vbo_, ebo_; + } handles_[NUM_CONTEXTS_]; + + cv::Rect viewports_[NUM_CONTEXTS_]; + + struct Cache { + cv::UMat down; + cv::UMat up; + cv::UMat blur; + cv::UMat dst16; + } cache_; + + //easing function for the bungee zoom + static float easeInOutQuint(float x) { + return x < 0.5f ? 16.0f * x * x * x * x * x : 1.0f - std::pow(-2.0f * x + 2.0f, 5.0f) / 2.0f; + } + + //Load objects and buffers + static void load_buffers(Handles& handles) { + GL_CHECK(glGenVertexArrays(1, &handles.vao_)); + GL_CHECK(glBindVertexArray(handles.vao_)); + + GL_CHECK(glGenBuffers(1, &handles.vbo_)); + GL_CHECK(glGenBuffers(1, &handles.ebo_)); + + GL_CHECK(glBindBuffer(GL_ARRAY_BUFFER, handles.vbo_)); + GL_CHECK(glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW)); + + GL_CHECK(glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, handles.ebo_)); + GL_CHECK(glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW)); + + GL_CHECK(glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), (void*) 0)); + GL_CHECK(glEnableVertexAttribArray(0)); + + GL_CHECK(glBindBuffer(GL_ARRAY_BUFFER, 0)); + GL_CHECK(glBindVertexArray(0)); + } + + //mandelbrot shader code adapted from my own project: https://github.com/kallaballa/FractalDive#after + static GLuint load_shader() { + #if !defined(OPENCV_V4D_USE_ES3) + const string shaderVersion = "330"; + #else + const string shaderVersion = "300 es"; + #endif + + const string vert = + " #version " + shaderVersion + + R"( + in vec4 position; + + void main() + { + gl_Position = vec4(position.xyz, 1.0); + })"; + + const string frag = + " #version " + shaderVersion + + R"( + precision highp float; + + out vec4 outColor; + + uniform vec4 base_color; + uniform int contrast_boost; + uniform int max_iterations; + uniform float current_zoom; + uniform float center_y; + uniform float center_x; + uniform float offset_y; + uniform float offset_x; + + uniform vec2 resolution; + + int get_iterations() + { + float pointr = (((gl_FragCoord.x / resolution[0]) - 0.5f) * current_zoom + center_x); + float pointi = (((gl_FragCoord.y / resolution[1]) - 0.5f) * current_zoom + center_y); + const float four = 4.0f; + + int iterations = 0; + float zi = 0.0f; + float zr = 0.0f; + float zrsqr = 0.0f; + float 
zisqr = 0.0f; + + while (iterations < max_iterations && zrsqr + zisqr < four) { + //equals following line as a consequence of binomial expansion: zi = (((zr + zi)*(zr + zi)) - zrsqr) - zisqr + zi = (zr + zr) * zi; + + zi += pointi; + zr = (zrsqr - zisqr) + pointr; + + zrsqr = zr * zr; + zisqr = zi * zi; + ++iterations; + } + return iterations; + } + + void mandelbrot() + { + int iter = get_iterations(); + if (iter < max_iterations) { + float iterations = float(iter) / float(max_iterations); + float cb = float(contrast_boost); + float logBase; + if(iter % 2 == 0) + logBase = 25.0f; + else + logBase = 50.0f; + + float logDiv = log2(logBase); + float colorBoost = iterations * cb; + outColor = vec4(log2((logBase - 1.0f) * base_color[0] * colorBoost + 1.0f)/logDiv, + log2((logBase - 1.0f) * base_color[1] * colorBoost + 1.0f)/logDiv, + log2((logBase - 1.0f) * base_color[2] * colorBoost + 1.0f)/logDiv, + base_color[3]); + } else { + outColor = vec4(0,0,0,0); + } + } + + void main() + { + mandelbrot(); + })"; + unsigned int handles[3]; + cv::v4d::initShader(handles, vert.c_str(), frag.c_str(), "fragColor"); + return handles[0]; + } + + //Initialize shaders, objects, buffers and uniforms + static void init_scene(const cv::Rect& viewport, Handles& handles) { + GL_CHECK(glEnable(GL_BLEND)); + GL_CHECK(glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA)); + handles.shaderHdl_ = load_shader(); + load_buffers(handles); + + handles.baseColorHdl_ = glGetUniformLocation(handles.shaderHdl_, "base_color"); + handles.contrastBoostHdl_ = glGetUniformLocation(handles.shaderHdl_, "contrast_boost"); + handles.maxIterationsHdl_ = glGetUniformLocation(handles.shaderHdl_, "max_iterations"); + handles.currentZoomHdl_ = glGetUniformLocation(handles.shaderHdl_, "current_zoom"); + handles.centerXHdl_ = glGetUniformLocation(handles.shaderHdl_, "center_x"); + handles.centerYHdl_ = glGetUniformLocation(handles.shaderHdl_, "center_y"); + handles.offsetXHdl_ = glGetUniformLocation(handles.shaderHdl_, "offset_x"); + handles.offsetYHdl_ = glGetUniformLocation(handles.shaderHdl_, "offset_y"); + handles.resolutionHdl_ = glGetUniformLocation(handles.shaderHdl_, "resolution"); + GL_CHECK(glViewport(viewport.x, viewport.y, viewport.width, viewport.height)); + } + + //Free OpenGL resources + static void destroy_scene(Handles& handles) { + glDeleteShader(handles.shaderHdl_); + glDeleteBuffers(1, &handles.vbo_); + glDeleteBuffers(1, &handles.ebo_); + glDeleteVertexArrays(1, &handles.vao_); + } + + //Render the mandelbrot fractal on top of a video + static void render_scene(const cv::Size& sz, const cv::Rect& viewport, Params& params, Handles& handles) { + GL_CHECK(glViewport(viewport.x, viewport.y, viewport.width, viewport.height)); + + //bungee zoom + if (params.currentZoom_ >= 3) { + params.zoomIn = true; + } else if (params.currentZoom_ < 0.05) { + params.zoomIn = false; + } + + params.zoomIncr_ = (params.currentZoom_ / 100); + if(params.zoomIn) + params.zoomIncr_ = -params.zoomIncr_; + + GL_CHECK(glUseProgram(handles.shaderHdl_)); + GL_CHECK(glUniform4f(handles.baseColorHdl_, params.baseColorVal_[0], params.baseColorVal_[1], params.baseColorVal_[2], params.baseColorVal_[3])); + GL_CHECK(glUniform1i(handles.contrastBoostHdl_, params.contrastBoost_)); + GL_CHECK(glUniform1i(handles.maxIterationsHdl_, params.maxIterations_)); + GL_CHECK(glUniform1f(handles.centerYHdl_, params.centerY_)); + GL_CHECK(glUniform1f(handles.centerXHdl_, params.centerX_)); + GL_CHECK(glUniform1f(handles.offsetYHdl_, viewport.x)); + 
GL_CHECK(glUniform1f(handles.offsetXHdl_, viewport.y)); + + if (!params.manualNavigation_) { + params.currentZoom_ += params.zoomIncr_; + GL_CHECK(glUniform1f(handles.currentZoomHdl_, easeInOutQuint(params.currentZoom_))); + } else { + params.currentZoom_ = 1.0 / pow(params.zoomFactor_, 5.0f); + GL_CHECK(glUniform1f(handles.currentZoomHdl_, params.currentZoom_)); + } + float res[2] = {float(sz.width), float(sz.height)}; + GL_CHECK(glUniform2fv(handles.resolutionHdl_, 1, res)); + + GL_CHECK(glBindVertexArray(handles.vao_)); + GL_CHECK(glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0)); + } +public: + ShaderDemoPlan(const cv::Rect& viewport) : Plan(viewport) { + Global::registerShared(params_); + } + + ShaderDemoPlan(const cv::Size& sz) : ShaderDemoPlan(cv::Rect(0,0,sz.width, sz.height)) { + } + + void gui(cv::Ptr window) override { + window->imgui([](cv::Ptr win, ImGuiContext* ctx, Params& params) { + CV_UNUSED(win); + using namespace ImGui; + SetCurrentContext(ctx); + Begin("Fractal"); + Text("Navigation"); + SliderInt("Iterations", ¶ms.maxIterations_, 3, 100000); + DragFloat("X", ¶ms.centerX_, 0.000001, -1.0f, 1.0f); + DragFloat("Y", ¶ms.centerY_, 0.000001, -1.0f, 1.0f); + if(SliderFloat("Zoom", ¶ms.zoomFactor_, 0.0001f, 10.0f)) + params.manualNavigation_ = true; + Text("Color"); + ColorPicker4("Color", params.baseColorVal_); + SliderInt("Contrast boost", ¶ms.contrastBoost_, 1, 255); + End(); + }, params_); + } + + void setup(cv::Ptr window) override { + float w = size().width; + float h = size().height; + float tw = w / TILING_; + float th = h / TILING_; + + for(size_t i = 0; i < TILING_; ++i) { + for(size_t j = 0; j < TILING_; ++j) { + viewports_[i * TILING_ + j] = cv::Rect(tw * i, th * j, tw - 1, th - 1); + } + } + + for(size_t i = 0; i < NUM_CONTEXTS_; ++i) { + window->gl(i, [](const int32_t& ctxID, const cv::Rect& viewport, Handles& handles) { + init_scene(viewport, handles); + }, viewports_[i], handles_[i]); + } + } + + void infer(cv::Ptr window) override { + window->capture(); + + for(size_t i = 0; i < NUM_CONTEXTS_; ++i) { + window->gl(i,[](const int32_t& ctxID, const cv::Size& sz, const cv::Rect& viewport, Params& params, Handles& handles) { + Params p = Global::safe_copy(params); + render_scene(sz, viewport, p, handles); + }, size(), viewports_[i], params_, handles_[i]); + } + + window->write(); + } + + void teardown(cv::Ptr window) override { + for(size_t i = 0; i < NUM_CONTEXTS_; ++i) { + window->gl(i, [](const int32_t& ctxID, Handles& handles) { + destroy_scene(handles); + }, handles_[i]); + } + } +}; + +ShaderDemoPlan::Params ShaderDemoPlan::params_; + +int main(int argc, char** argv) { + if (argc != 2) { + cerr << "Usage: shader-demo " << endl; + exit(1); + } + + cv::Ptr plan = new ShaderDemoPlan(cv::Size(1280, 720)); + cv::Ptr window = V4D::make(plan->size(), "Mandelbrot Shader Demo", IMGUI); + + auto src = makeCaptureSource(window, argv[1]); + auto sink = makeWriterSink(window, "shader-demo.mkv", src->fps(), plan->size()); + window->setSource(src); + window->setSink(sink); + + window->run(plan); + + return 0; +} diff --git a/modules/v4d/samples/vector_graphics.cpp b/modules/v4d/samples/vector_graphics.cpp new file mode 100644 index 000000000..78ef2af98 --- /dev/null +++ b/modules/v4d/samples/vector_graphics.cpp @@ -0,0 +1,111 @@ +#include + +using namespace cv; +using namespace cv::v4d; + +class VectorGraphicsPlan: public Plan { +public: + VectorGraphicsPlan(const cv::Size& sz) : Plan(sz) { + } + + void infer(Ptr win) override { + //Creates a NanoVG context and draws 
googly eyes that occasionally blink. + win->nvg([](const Size &sz) { + //Calls from this namespace may only be used inside a nvg context. + //Nvg calls work exactly like their c-funtion counterparts. + //Please refer to the NanoVG documentation for details. + using namespace cv::v4d::nvg; + clear(); + + static long start = cv::getTickCount() / cv::getTickFrequency(); + float t = cv::getTickCount() / cv::getTickFrequency() - start; + float x = 0; + float y = 0; + float w = sz.width / 4; + float h = sz.height / 4; + translate((sz.width / 2.0f) - (w / 2.0f), (sz.height / 2.0f) - (h / 2.0f)); + float mx = w / 2.0; + float my = h / 2.0; + Paint gloss, bg; + float ex = w * 0.23f; + float ey = h * 0.5f; + float lx = x + ex; + float ly = y + ey; + float rx = x + w - ex; + float ry = y + ey; + float dx, dy, d; + float br = (ex < ey ? ex : ey) * 0.5f; + float blink = 1 - pow(sinf(t * 0.5f), 200) * 0.8f; + + bg = linearGradient(x, y + h * 0.5f, x + w * 0.1f, y + h, + cv::Scalar(0, 0, 0, 32), cv::Scalar(0, 0, 0, 16)); + beginPath(); + ellipse(lx + 3.0f, ly + 16.0f, ex, ey); + ellipse(rx + 3.0f, ry + 16.0f, ex, ey); + fillPaint(bg); + fill(); + + bg = linearGradient(x, y + h * 0.25f, x + w * 0.1f, y + h, + cv::Scalar(220, 220, 220, 255), + cv::Scalar(128, 128, 128, 255)); + beginPath(); + ellipse(lx, ly, ex, ey); + ellipse(rx, ry, ex, ey); + fillPaint(bg); + fill(); + + dx = (mx - rx) / (ex * 10); + dy = (my - ry) / (ey * 10); + d = sqrtf(dx * dx + dy * dy); + if (d > 1.0f) { + dx /= d; + dy /= d; + } + dx *= ex * 0.4f; + dy *= ey * 0.5f; + beginPath(); + ellipse(lx + dx, ly + dy + ey * 0.25f * (1 - blink), br, + br * blink); + fillColor(cv::Scalar(32, 32, 32, 255)); + fill(); + + dx = (mx - rx) / (ex * 10); + dy = (my - ry) / (ey * 10); + d = sqrtf(dx * dx + dy * dy); + if (d > 1.0f) { + dx /= d; + dy /= d; + } + dx *= ex * 0.4f; + dy *= ey * 0.5f; + beginPath(); + ellipse(rx + dx, ry + dy + ey * 0.25f * (1 - blink), br, + br * blink); + fillColor(cv::Scalar(32, 32, 32, 255)); + fill(); + + gloss = radialGradient(lx - ex * 0.25f, ly - ey * 0.5f, + ex * 0.1f, ex * 0.75f, cv::Scalar(255, 255, 255, 128), + cv::Scalar(255, 255, 255, 0)); + beginPath(); + ellipse(lx, ly, ex, ey); + fillPaint(gloss); + fill(); + + gloss = radialGradient(rx - ex * 0.25f, ry - ey * 0.5f, + ex * 0.1f, ex * 0.75f, cv::Scalar(255, 255, 255, 128), + cv::Scalar(255, 255, 255, 0)); + beginPath(); + ellipse(rx, ry, ex, ey); + fillPaint(gloss); + fill(); + }, win->fbSize()); + } +}; + +int main() { + Ptr plan = new VectorGraphicsPlan(cv::Size(960, 960)); + Ptr window = V4D::make(plan->size(), "Vector Graphics"); + window->run(plan); +} + diff --git a/modules/v4d/samples/vector_graphics_and_fb.cpp b/modules/v4d/samples/vector_graphics_and_fb.cpp new file mode 100644 index 000000000..55daace27 --- /dev/null +++ b/modules/v4d/samples/vector_graphics_and_fb.cpp @@ -0,0 +1,110 @@ +#include +#include + +using namespace cv; +using namespace cv::v4d; + +class VectorGraphicsAndFBPlan : public Plan { +public: + VectorGraphicsAndFBPlan(const cv::Size& sz) : Plan(sz) { + } + + void infer(Ptr window) override { + //Again creates a NanoVG context and draws googly eyes + window->nvg([](const Size& sz) { + //Calls from this namespace may only be used inside a nvg context + using namespace cv::v4d::nvg; + clear(); + + static long start = cv::getTickCount() / cv::getTickFrequency(); + float t = cv::getTickCount() / cv::getTickFrequency() - start; + float x = 0; + float y = 0; + float w = sz.width / 4; + float h = sz.height / 4; + translate((sz.width / 
2.0f) - (w / 2.0f), (sz.height / 2.0f) - (h / 2.0f)); + float mx = w / 2.0; + float my = h / 2.0; + Paint gloss, bg; + float ex = w * 0.23f; + float ey = h * 0.5f; + float lx = x + ex; + float ly = y + ey; + float rx = x + w - ex; + float ry = y + ey; + float dx, dy, d; + float br = (ex < ey ? ex : ey) * 0.5f; + float blink = 1 - pow(sinf(t * 0.5f), 200) * 0.8f; + + bg = linearGradient(x, y + h * 0.5f, x + w * 0.1f, y + h, cv::Scalar(0, 0, 0, 32), cv::Scalar(0,0,0,16)); + beginPath(); + ellipse(lx + 3.0f, ly + 16.0f, ex, ey); + ellipse(rx + 3.0f, ry + 16.0f, ex, ey); + fillPaint(bg); + fill(); + + bg = linearGradient(x, y + h * 0.25f, x + w * 0.1f, y + h, + cv::Scalar(220, 220, 220, 255), cv::Scalar(128, 128, 128, 255)); + beginPath(); + ellipse(lx, ly, ex, ey); + ellipse(rx, ry, ex, ey); + fillPaint(bg); + fill(); + + dx = (mx - rx) / (ex * 10); + dy = (my - ry) / (ey * 10); + d = sqrtf(dx * dx + dy * dy); + if (d > 1.0f) { + dx /= d; + dy /= d; + } + dx *= ex * 0.4f; + dy *= ey * 0.5f; + beginPath(); + ellipse(lx + dx, ly + dy + ey * 0.25f * (1 - blink), br, br * blink); + fillColor(cv::Scalar(32, 32, 32, 255)); + fill(); + + dx = (mx - rx) / (ex * 10); + dy = (my - ry) / (ey * 10); + d = sqrtf(dx * dx + dy * dy); + if (d > 1.0f) { + dx /= d; + dy /= d; + } + dx *= ex * 0.4f; + dy *= ey * 0.5f; + beginPath(); + ellipse(rx + dx, ry + dy + ey * 0.25f * (1 - blink), br, br * blink); + fillColor(cv::Scalar(32, 32, 32, 255)); + fill(); + + gloss = radialGradient(lx - ex * 0.25f, ly - ey * 0.5f, ex * 0.1f, ex * 0.75f, + cv::Scalar(255, 255, 255, 128), cv::Scalar(255, 255, 255, 0)); + beginPath(); + ellipse(lx, ly, ex, ey); + fillPaint(gloss); + fill(); + + gloss = radialGradient(rx - ex * 0.25f, ry - ey * 0.5f, ex * 0.1f, ex * 0.75f, + cv::Scalar(255, 255, 255, 128), cv::Scalar(255, 255, 255, 0)); + beginPath(); + ellipse(rx, ry, ex, ey); + fillPaint(gloss); + fill(); + }, window->fbSize()); + + //Provides the framebuffer as left-off by the nvg context. + window->fb([](UMat& framebuffer) { + //Heavily blurs the eyes using a cheap boxFilter + boxFilter(framebuffer, framebuffer, -1, Size(15, 15), Point(-1,-1), true, BORDER_REPLICATE); + }); + } +}; +int main() { + Ptr plan = new VectorGraphicsAndFBPlan(cv::Size(960, 960)); + Ptr window = V4D::make(plan->size(), "Vector Graphics and Framebuffer"); + window->run(plan); +} + + diff --git a/modules/v4d/samples/video-demo.cpp b/modules/v4d/samples/video-demo.cpp new file mode 100644 index 000000000..8f638970e --- /dev/null +++ b/modules/v4d/samples/video-demo.cpp @@ -0,0 +1,228 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +/* + * Based on cube-demo. Only differs in two points: + * - Uses a source to read a video. + * - Doesn't clear the background so the cube is rendered on top of the video. 
+ */ + +#include + +using std::cerr; +using std::endl; + +using namespace cv::v4d; + +class VideoDemoPlan: public Plan { +public: + using Plan::Plan; + /* Demo Parameters */ + int glowKernelSize_ = 0; +private: + struct Cache { + cv::UMat up_; + cv::UMat down_; + cv::UMat blur_; + cv::UMat dst16_; + } cache_; + + /* OpenGL constants */ + constexpr static GLuint TRIANGLES_ = 12; + constexpr static GLuint VERTICES_INDEX_ = 0; + constexpr static GLuint COLOR_INDEX_ = 1; + + constexpr static float VERTICES_[24] = { + // Front face + 0.5, 0.5, 0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, 0.5, + + // Back face + 0.5, 0.5, -0.5, -0.5, 0.5, -0.5, -0.5, -0.5, -0.5, 0.5, -0.5, -0.5, }; + + constexpr static float VERTEX_COLORS[24] = { 1.0, 0.4, 0.6, 1.0, 0.9, 0.2, 0.7, 0.3, 0.8, 0.5, 0.3, 1.0, + + 0.2, 0.6, 1.0, 0.6, 1.0, 0.4, 0.6, 0.8, 0.8, 0.4, 0.8, 0.8, }; + + constexpr static unsigned short TRIANGLE_INDICES_[36] = { + // Front + 0, 1, 2, 2, 3, 0, + + // Right + 0, 3, 7, 7, 4, 0, + + // Bottom + 2, 6, 7, 7, 3, 2, + + // Left + 1, 5, 6, 6, 2, 1, + + // Back + 4, 7, 6, 6, 5, 4, + + // Top + 5, 1, 0, 0, 4, 5, }; + /* OpenGL variables */ + GLuint vao_ = 0; + GLuint shader_ = 0; + GLuint uniform_transform_ = 0; + + static GLuint load_shader() { + #if !defined(OPENCV_V4D_USE_ES3) + const string shaderVersion = "330"; + #else + const string shaderVersion = "300 es"; + #endif + + const string vert = + " #version " + shaderVersion + + R"( + precision lowp float; + layout(location = 0) in vec3 pos; + layout(location = 1) in vec3 vertex_color; + + uniform mat4 transform; + + out vec3 color; + void main() { + gl_Position = transform * vec4(pos, 1.0); + color = vertex_color; + } + )"; + + const string frag = + " #version " + shaderVersion + + R"( + precision lowp float; + in vec3 color; + + out vec4 frag_color; + + void main() { + frag_color = vec4(color, 1.0); + } + )"; + + unsigned int handles[3]; + cv::v4d::initShader(handles, vert.c_str(), frag.c_str(), "fragColor"); + return handles[0]; + } + + static void init_scene(GLuint& vao, GLuint& shader, GLuint& uniformTrans) { + glEnable (GL_DEPTH_TEST); + + glGenVertexArrays(1, &vao); + glBindVertexArray(vao); + + unsigned int triangles_ebo; + glGenBuffers(1, &triangles_ebo); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, triangles_ebo); + glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof TRIANGLE_INDICES_, TRIANGLE_INDICES_, + GL_STATIC_DRAW); + + unsigned int verticies_vbo; + glGenBuffers(1, &verticies_vbo); + glBindBuffer(GL_ARRAY_BUFFER, verticies_vbo); + glBufferData(GL_ARRAY_BUFFER, sizeof VERTICES_, VERTICES_, GL_STATIC_DRAW); + + glVertexAttribPointer(VERTICES_INDEX_, 3, GL_FLOAT, GL_FALSE, 0, NULL); + glEnableVertexAttribArray(VERTICES_INDEX_); + + unsigned int colors_vbo; + glGenBuffers(1, &colors_vbo); + glBindBuffer(GL_ARRAY_BUFFER, colors_vbo); + glBufferData(GL_ARRAY_BUFFER, sizeof VERTEX_COLORS, VERTEX_COLORS, GL_STATIC_DRAW); + + glVertexAttribPointer(COLOR_INDEX_, 3, GL_FLOAT, GL_FALSE, 0, NULL); + glEnableVertexAttribArray(COLOR_INDEX_); + + glBindVertexArray(0); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0); + glBindBuffer(GL_ARRAY_BUFFER, 0); + + shader = load_shader(); + uniformTrans = glGetUniformLocation(shader, "transform"); + } + + static void render_scene(GLuint& vao, GLuint& shader, GLuint& uniformTrans) { + glUseProgram(shader); + + float angle = fmod(double(cv::getTickCount()) / double(cv::getTickFrequency()), 2 * M_PI); + float scale = 0.25; + + cv::Matx44f scaleMat(scale, 0.0, 0.0, 0.0, 0.0, scale, 0.0, 0.0, 0.0, 0.0, scale, 0.0, 0.0, 0.0, + 
0.0, 1.0); + + cv::Matx44f rotXMat(1.0, 0.0, 0.0, 0.0, 0.0, cos(angle), -sin(angle), 0.0, 0.0, sin(angle), + cos(angle), 0.0, 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotYMat(cos(angle), 0.0, sin(angle), 0.0, 0.0, 1.0, 0.0, 0.0, -sin(angle), 0.0, + cos(angle), 0.0, 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotZMat(cos(angle), -sin(angle), 0.0, 0.0, sin(angle), cos(angle), 0.0, 0.0, 0.0, + 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f transform = scaleMat * rotXMat * rotYMat * rotZMat; + glUniformMatrix4fv(uniformTrans, 1, GL_FALSE, transform.val); + glBindVertexArray(vao); + glDrawElements(GL_TRIANGLES, TRIANGLES_ * 3, GL_UNSIGNED_SHORT, NULL); + + } + + static void glow_effect(const cv::UMat& src, cv::UMat& dst, const int ksize, Cache& cache) { + cv::bitwise_not(src, dst); + + cv::resize(dst, cache.down_, cv::Size(), 0.5, 0.5); + cv::boxFilter(cache.down_, cache.blur_, -1, cv::Size(ksize, ksize), cv::Point(-1, -1), true, + cv::BORDER_REPLICATE); + cv::resize(cache.blur_, cache.up_, src.size()); + + cv::multiply(dst, cache.up_, cache.dst16_, 1, CV_16U); + cv::divide(cache.dst16_, cv::Scalar::all(255.0), dst, 1, CV_8U); + + cv::bitwise_not(dst, dst); + } +public: + void setup(cv::Ptr window) override { + int diag = hypot(double(size().width), double(size().height)); + glowKernelSize_ = std::max(int(diag / 138 % 2 == 0 ? diag / 138 + 1 : diag / 138), 1); + + window->gl([](GLuint& vao, GLuint& shader, GLuint& uniformTrans) { + init_scene(vao, shader, uniformTrans); + }, vao_, shader_, uniform_transform_); + } + void infer(cv::Ptr window) override { + window->capture(); + + window->gl([](GLuint& vao, GLuint& shader, GLuint& uniformTrans) { + render_scene(vao, shader, uniformTrans); + }, vao_, shader_, uniform_transform_); + + window->fb([](cv::UMat& framebuffer, const cv::Rect& viewport, int glowKernelSize, Cache& cache) { + cv::UMat roi = framebuffer(viewport); + glow_effect(roi, roi, glowKernelSize, cache); + }, viewport(), glowKernelSize_, cache_); + + window->write(); + } +}; + +int main(int argc, char** argv) { + if (argc != 2) { + cerr << "Usage: video-demo " << endl; + exit(1); + } + + cv::Ptr plan = new VideoDemoPlan(cv::Size(1280,720)); + cv::Ptr window = V4D::make(plan->size(), "Video Demo", NONE); + + auto src = makeCaptureSource(window, argv[1]); + auto sink = makeWriterSink(window, "video-demo.mkv", src->fps(), plan->size()); + window->setSource(src); + window->setSink(sink); + + window->run(plan); + + return 0; +} diff --git a/modules/v4d/samples/video_editing.cpp b/modules/v4d/samples/video_editing.cpp new file mode 100644 index 000000000..1aa3b0607 --- /dev/null +++ b/modules/v4d/samples/video_editing.cpp @@ -0,0 +1,53 @@ +#include + +using namespace cv; +using namespace cv::v4d; + +class VideoEditingPlan : public Plan { + cv::UMat frame_; + const string hv_ = "Hello Video!"; +public: + VideoEditingPlan(const cv::Size& sz) : Plan(sz) { + } + + void infer(Ptr win) override { + //Capture video from the source + win->capture(); + + //Render on top of the video + win->nvg([](const Size& sz, const string& str) { + using namespace cv::v4d::nvg; + + fontSize(40.0f); + fontFace("sans-bold"); + fillColor(Scalar(255, 0, 0, 255)); + textAlign(NVG_ALIGN_CENTER | NVG_ALIGN_TOP); + text(sz.width / 2.0, sz.height / 2.0, str.c_str(), str.c_str() + str.size()); + }, win->fbSize(), hv_); + + //Write video to the sink (do nothing in case of WebAssembly) + win->write(); + } +}; + +int main(int argc, char** argv) { + if (argc != 3) { + cerr << "Usage: video_editing " << endl; + exit(1); + } + Ptr plan 
= new VideoEditingPlan(cv::Size(960,960)); + Ptr window = V4D::make(plan->size(), "Video Editing"); + + //Make the video source + auto src = makeCaptureSource(window, argv[1]); + + //Make the video sink + auto sink = makeWriterSink(window, argv[2], src->fps(), plan->size()); + + //Attach source and sink + window->setSource(src); + window->setSink(sink); + + window->run(plan); +} + diff --git a/modules/v4d/src/detail/framebuffercontext.cpp b/modules/v4d/src/detail/framebuffercontext.cpp new file mode 100644 index 000000000..9da658b62 --- /dev/null +++ b/modules/v4d/src/detail/framebuffercontext.cpp @@ -0,0 +1,793 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/detail/framebuffercontext.hpp" +#include "opencv2/v4d/v4d.hpp" +#include "opencv2/v4d/util.hpp" +#include "opencv2/core/ocl.hpp" +#include "opencv2/v4d/detail/gl.hpp" + +#include "opencv2/core/opengl.hpp" +#include +#include +#include +#include "imgui_impl_glfw.h" +#define GLFW_INCLUDE_NONE +#include +using std::cerr; +using std::cout; +using std::endl; + +namespace cv { +namespace v4d { + +namespace detail { + +static void glfw_error_callback(int error, const char* description) { +#ifndef NDEBUG + fprintf(stderr, "GLFW Error: (%d) %s\n", error, description); +#endif +} + +bool FrameBufferContext::firstSync_ = true; + +int frameBufferContextCnt = 0; + +FrameBufferContext::FrameBufferContext(V4D& v4d, const string& title, cv::Ptr other) : + FrameBufferContext(v4d, other->framebufferSize_, !other->debug_, title, other->major_, other->minor_, other->samples_, other->debug_, other->rootWindow_, other, false) { +} + +FrameBufferContext::FrameBufferContext(V4D& v4d, const cv::Size& framebufferSize, bool offscreen, + const string& title, int major, int minor, int samples, bool debug, GLFWwindow* rootWindow, cv::Ptr parent, bool root) : + v4d_(&v4d), offscreen_(offscreen), title_(title), major_(major), minor_( + minor), samples_(samples), debug_(debug), isVisible_(offscreen), viewport_(0, 0, framebufferSize.width, framebufferSize.height), framebufferSize_(framebufferSize), hasParent_(false), rootWindow_(rootWindow), parent_(parent), framebuffer_(), isRoot_(root) { + init(); + index_ = ++frameBufferContextCnt; +} + +FrameBufferContext::~FrameBufferContext() { + teardown(); +} + +GLuint FrameBufferContext::getFramebufferID() { + return frameBufferID_; +} + +GLuint FrameBufferContext::getTextureID() { + return textureID_; +} + + +void FrameBufferContext::loadShader(const size_t& index) { +#if !defined(OPENCV_V4D_USE_ES3) + const string shaderVersion = "330"; +#else + const string shaderVersion = "300 es"; +#endif + + const string vert = + " #version " + shaderVersion + + R"( + layout (location = 0) in vec3 aPos; + + void main() + { + gl_Position = vec4(aPos, 1.0); + } +)"; + + const string frag = + " #version " + shaderVersion + + R"( + precision mediump float; + out vec4 FragColor; + + uniform sampler2D texture0; + uniform vec2 resolution; + + void main() + { + //translate screen coordinates to texture coordinates and flip the y-axis. 
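+            //Note: the negative y-divisor yields a negative texture coordinate; with the default REPEAT wrap this samples the texture flipped vertically.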
+ vec4 texPos = gl_FragCoord / vec4(resolution.x, resolution.y * -1.0f, 1.0, 1.0); + vec4 texColor0 = texture(texture0, texPos.xy); + if(texColor0.a == 0.0) + discard; + else + FragColor = texColor0; + } +)"; + + unsigned int handles[3]; + cv::v4d::initShader(handles, vert.c_str(), frag.c_str(), "fragColor"); + shader_program_hdls_[index] = handles[0]; +} + +void FrameBufferContext::loadBuffers(const size_t& index) { + glGenVertexArrays(1, ©Vaos[index]); + glBindVertexArray(copyVaos[index]); + + glGenBuffers(1, ©Vbos[index]); + glGenBuffers(1, ©Ebos[index]); + + glBindBuffer(GL_ARRAY_BUFFER, copyVbos[index]); + glBufferData(GL_ARRAY_BUFFER, sizeof(copyVertices), copyVertices, GL_STATIC_DRAW); + + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, copyEbos[index]); + glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(copyIndices), copyIndices, GL_STATIC_DRAW); + + glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), (void*) 0); + glEnableVertexAttribArray(0); + + glBindBuffer(GL_ARRAY_BUFFER, 0); + glBindVertexArray(0); +} + +void FrameBufferContext::init() { + static std::mutex initMtx; + std::unique_lock lock(initMtx); + + if(parent_) { + hasParent_ = true; + + if(isRoot()) { + textureID_ = 0; + renderBufferID_ = 0; + onscreenTextureID_ = parent_->textureID_; + onscreenRenderBufferID_ = parent_->renderBufferID_; + } else { + textureID_ = parent_->textureID_; + renderBufferID_ = parent_->renderBufferID_; + onscreenTextureID_ = parent_->onscreenTextureID_; + onscreenRenderBufferID_ = parent_->onscreenRenderBufferID_; + } + } else if (glfwInit() != GLFW_TRUE) { + cerr << "Can't init GLFW" << endl; + exit(1); + } + + glfwSetErrorCallback(cv::v4d::detail::glfw_error_callback); + + if (debug_) + glfwWindowHint(GLFW_OPENGL_DEBUG_CONTEXT, GLFW_TRUE); + + glfwSetTime(0); +#ifdef __APPLE__ + glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3); + glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 2); + glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GL_TRUE); + glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE); +#elif defined(OPENCV_V4D_USE_ES3) + glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3); + glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 0); + glfwWindowHint(GLFW_CONTEXT_CREATION_API, GLFW_EGL_CONTEXT_API); + glfwWindowHint(GLFW_CLIENT_API, GLFW_OPENGL_ES_API); +#else + glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, major_); + glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, minor_); + glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE); + glfwWindowHint(GLFW_CONTEXT_CREATION_API, GLFW_EGL_CONTEXT_API); + glfwWindowHint(GLFW_CLIENT_API, GLFW_OPENGL_API) ; +#endif + glfwWindowHint(GLFW_SAMPLES, samples_); + glfwWindowHint(GLFW_RED_BITS, 8); + glfwWindowHint(GLFW_GREEN_BITS, 8); + glfwWindowHint(GLFW_BLUE_BITS, 8); + glfwWindowHint(GLFW_ALPHA_BITS, 8); + glfwWindowHint(GLFW_STENCIL_BITS, 8); + glfwWindowHint(GLFW_DEPTH_BITS, 24); + glfwWindowHint(GLFW_RESIZABLE, GLFW_TRUE); + glfwWindowHint(GLFW_VISIBLE, offscreen_ ? 
GLFW_FALSE : GLFW_TRUE ); + glfwWindowHint(GLFW_DOUBLEBUFFER, GLFW_TRUE); + + glfwWindow_ = glfwCreateWindow(framebufferSize_.width, framebufferSize_.height, title_.c_str(), nullptr, rootWindow_); + + + if (glfwWindow_ == nullptr) { + //retry with native api + glfwWindowHint(GLFW_CONTEXT_CREATION_API, GLFW_NATIVE_CONTEXT_API); + glfwWindow_ = glfwCreateWindow(framebufferSize_.width, framebufferSize_.height, title_.c_str(), nullptr, + rootWindow_); + + if (glfwWindow_ == nullptr) { + CV_Error(Error::StsError, "Unable to initialize window."); + } + } + + this->makeCurrent(); + + if(isRoot()) { + rootWindow_ = glfwWindow_; + glfwSwapInterval(1); + } else { + glfwSwapInterval(0); + } + +#if !defined(OPENCV_V4D_USE_ES3) + if (!parent_) { + GLenum err = glewInit(); + if (err != GLEW_OK && err != GLEW_ERROR_NO_GLX_DISPLAY) { + CV_Error(Error::StsError, "Could not initialize GLEW!"); + } + } +#endif + try { + if (isRoot() && isClGlSharingSupported()) + cv::ogl::ocl::initializeContextFromGL(); + else + clglSharing_ = false; + } catch (std::exception& ex) { + CV_LOG_WARNING(nullptr, "CL-GL sharing failed: %s" << ex.what()); + clglSharing_ = false; + } catch (...) { + CV_LOG_WARNING(nullptr, "CL-GL sharing failed with unknown error"); + clglSharing_ = false; + } +//#else +// clglSharing_ = false; +//#endif + + context_ = CLExecContext_t::getCurrent(); + + setup(); + if(isRoot()) { + glfwSetWindowUserPointer(getGLFWWindow(), getV4D().get()); + glfwSetCursorPosCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, double x, double y) { + V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); + if(v4d->hasImguiCtx()) { + ImGui_ImplGlfw_CursorPosCallback(glfwWin, x, y); + if (!ImGui::GetIO().WantCaptureMouse) { + v4d->setMousePosition(cv::Point2f(float(x), float(y))); + } + } + }); + + glfwSetMouseButtonCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, int button, int action, int modifiers) { + V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); + + if(v4d->hasImguiCtx()) { + ImGui_ImplGlfw_MouseButtonCallback(glfwWin, button, action, modifiers); + + if (!ImGui::GetIO().WantCaptureMouse) { + // Pass event further + } else { + // Do nothing, since imgui already reacted to mouse click. It would be weird if unrelated things started happening when you click something on UI. + } + } + }); + + glfwSetKeyCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, int key, int scancode, int action, int mods) { + V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); + + if(v4d->hasImguiCtx()) { + ImGui_ImplGlfw_KeyCallback(glfwWin, key, scancode, action, mods); + if (!ImGui::GetIO().WantCaptureKeyboard) { + // Pass event further + } else { + // Do nothing, since imgui already reacted to mouse click. It would be weird if unrelated things started happening when you click something on UI. 
+ } + } + }); + glfwSetCharCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, unsigned int codepoint) { + V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); + + if(v4d->hasImguiCtx()) { + ImGui_ImplGlfw_CharCallback(glfwWin, codepoint); + } + }); +//// glfwSetDropCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, int count, const char** filenames) { +//// V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); +//// }); +// +// glfwSetScrollCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, double x, double y) { +// V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); +// if (v4d->hasImguiCtx()) { +// ImGui_ImplGlfw_ScrollCallback(glfwWin, x, y); +// } +// }); +// +// glfwSetWindowSizeCallback(getGLFWWindow(), +// [](GLFWwindow* glfwWin, int width, int height) { +// cerr << "glfwSetWindowSizeCallback: " << width << endl; +// run_sync_on_main<23>([glfwWin, width, height]() { +// V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); +// cv::Rect& vp = v4d->viewport(); +// cv::Size fbsz = v4d->framebufferSize(); +// vp.x = 0; +// vp.y = 0; +// vp.width = fbsz.width; +// vp.height = fbsz.height; +// }); +// }); +// +// glfwSetFramebufferSizeCallback(getGLFWWindow(), +// [](GLFWwindow* glfwWin, int width, int height) { +//// cerr << "glfwSetFramebufferSizeCallback: " << width << endl; +//// run_sync_on_main<22>([glfwWin, width, height]() { +//// V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); +////// v4d->makeCurrent(); +//// cv::Rect& vp = v4d->viewport(); +//// cv::Size fbsz = v4d->framebufferSize(); +//// vp.x = 0; +//// vp.y = 0; +//// vp.width = fbsz.width; +//// vp.height = fbsz.height; +//// +//// if(v4d->hasNguiCtx()) +//// v4d->nguiCtx().screen().resize_callback_event(width, height); +//// }); +//// if(v4d->isResizable()) { +//// v4d->nvgCtx().fbCtx()->teardown(); +//// v4d->glCtx().fbCtx()->teardown(); +//// v4d->fbCtx()->teardown(); +//// v4d->fbCtx()->setup(cv::Size(width, height)); +//// v4d->glCtx().fbCtx()->setup(cv::Size(width, height)); +//// v4d->nvgCtx().fbCtx()->setup(cv::Size(width, height)); +//// } +// }); + glfwSetWindowFocusCallback(getGLFWWindow(), [](GLFWwindow* glfwWin, int i) { + V4D* v4d = reinterpret_cast(glfwGetWindowUserPointer(glfwWin)); + if(v4d->getGLFWWindow() == glfwWin) { + v4d->setFocused(i == 1); + } + }); + } +} + +cv::Ptr FrameBufferContext::getV4D() { + return v4d_->self(); +} + +int FrameBufferContext::getIndex() { + return index_; +} + +void FrameBufferContext::setup() { + cv::Size sz = framebufferSize_; + CLExecScope_t clExecScope(getCLExecContext()); + framebuffer_.create(sz, CV_8UC4); + if(isRoot()) { + GL_CHECK(glGenFramebuffers(1, &frameBufferID_)); + GL_CHECK(glBindFramebuffer(GL_FRAMEBUFFER, frameBufferID_)); + GL_CHECK(glGenRenderbuffers(1, &renderBufferID_)); + + GL_CHECK(glGenTextures(1, &textureID_)); + GL_CHECK(glBindTexture(GL_TEXTURE_2D, textureID_)); + texture_ = new cv::ogl::Texture2D(sz, cv::ogl::Texture2D::RGBA, textureID_); + GL_CHECK(glPixelStorei(GL_UNPACK_ALIGNMENT, 1)); + GL_CHECK( + glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, sz.width, sz.height, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0)); + GL_CHECK(glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR)); + GL_CHECK(glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR)); + GL_CHECK( + glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, textureID_, 0)); + + GL_CHECK(glBindRenderbuffer(GL_RENDERBUFFER, renderBufferID_)); + GL_CHECK( + glRenderbufferStorage(GL_RENDERBUFFER, 
GL_DEPTH24_STENCIL8, sz.width, sz.height)); + GL_CHECK( + glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, renderBufferID_)); + + assert(glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE); + } else if(hasParent()) { + GL_CHECK(glGenFramebuffers(1, &frameBufferID_)); + GL_CHECK(glBindFramebuffer(GL_FRAMEBUFFER, frameBufferID_)); + GL_CHECK(glBindTexture(GL_TEXTURE_2D, textureID_)); + texture_ = new cv::ogl::Texture2D(sz, cv::ogl::Texture2D::RGBA, textureID_); + GL_CHECK(glPixelStorei(GL_UNPACK_ALIGNMENT, 1)); + GL_CHECK( + glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, sz.width, sz.height, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0)); + GL_CHECK(glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR)); + GL_CHECK(glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR)); + GL_CHECK( + glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, textureID_, 0)); + GL_CHECK(glBindRenderbuffer(GL_RENDERBUFFER, renderBufferID_)); + GL_CHECK( + glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH24_STENCIL8, sz.width, sz.height)); + GL_CHECK( + glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, renderBufferID_)); + assert(glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE); + } else + CV_Assert(false); +} + +void FrameBufferContext::teardown() { + using namespace cv::ocl; + this->makeCurrent(); +#ifdef HAVE_OPENCL + if(cv::ocl::useOpenCL() && clImage_ != nullptr && !getCLExecContext().empty()) { + CLExecScope_t clExecScope(getCLExecContext()); + + cl_int status = 0; + cl_command_queue q = (cl_command_queue) Queue::getDefault().ptr(); + + status = clEnqueueReleaseGLObjects(q, 1, &clImage_, 0, NULL, NULL); + if (status != CL_SUCCESS) + CV_Error_(cv::Error::OpenCLApiCallError, ("OpenCL: clEnqueueReleaseGLObjects failed: %d", status)); + + status = clFinish(q); // TODO Use events + if (status != CL_SUCCESS) + CV_Error_(cv::Error::OpenCLApiCallError, ("OpenCL: clFinish failed: %d", status)); + + status = clReleaseMemObject(clImage_); // TODO RAII + if (status != CL_SUCCESS) + CV_Error_(cv::Error::OpenCLApiCallError, ("OpenCL: clReleaseMemObject failed: %d", status)); + clImage_ = nullptr; + } +#endif + glBindTexture(GL_TEXTURE_2D, 0); + glGetError(); + glBindRenderbuffer(GL_RENDERBUFFER, 0); + glGetError(); + glBindFramebuffer(GL_FRAMEBUFFER, 0); + glGetError(); + CV_Assert(texture_ != nullptr); + delete texture_; + GL_CHECK(glDeleteTextures(1, &textureID_)); + GL_CHECK(glDeleteRenderbuffers(1, &renderBufferID_)); + GL_CHECK(glDeleteFramebuffers(1, &frameBufferID_)); + this->makeNoneCurrent(); +} + +#ifdef HAVE_OPENCL +void FrameBufferContext::toGLTexture2D(cv::UMat& u, cv::ogl::Texture2D& texture) { + CV_Assert(clImage_ != nullptr); + + using namespace cv::ocl; + + cl_int status = 0; + cl_command_queue q = (cl_command_queue) context_.getQueue().ptr(); + cl_mem clBuffer = (cl_mem) u.handle(ACCESS_READ); + + size_t offset = 0; + size_t dst_origin[3] = { 0, 0, 0 }; + size_t region[3] = { (size_t) u.cols, (size_t) u.rows, 1 }; + status = clEnqueueCopyBufferToImage(q, clBuffer, clImage_, offset, dst_origin, region, 0, NULL, + NULL); + if (status != CL_SUCCESS) + throw std::runtime_error("OpenCL: clEnqueueCopyBufferToImage failed: " + std::to_string(status)); + + status = clEnqueueReleaseGLObjects(q, 1, &clImage_, 0, NULL, NULL); + if (status != CL_SUCCESS) + throw std::runtime_error("OpenCL: clEnqueueReleaseGLObjects failed: " + std::to_string(status)); +} + +void 
FrameBufferContext::fromGLTexture2D(const cv::ogl::Texture2D& texture, cv::UMat& u) { + using namespace cv::ocl; + + const int dtype = CV_8UC4; + int textureType = dtype; + + if (u.size() != texture.size() || u.type() != textureType) { + u.create(texture.size(), textureType); + } + + cl_command_queue q = (cl_command_queue) context_.getQueue().ptr(); + cl_int status = 0; + + if (clImage_ == nullptr) { + Context& ctx = context_.getContext(); + cl_context context = (cl_context) ctx.ptr(); + clImage_ = clCreateFromGLTexture(context, CL_MEM_READ_ONLY, 0x0DE1, 0, texture.texId(), + &status); + if (status != CL_SUCCESS) + throw std::runtime_error("OpenCL: clCreateFromGLTexture failed: " + std::to_string(status)); + } + + status = clEnqueueAcquireGLObjects(q, 1, &clImage_, 0, NULL, NULL); + if (status != CL_SUCCESS) + throw std::runtime_error("OpenCL: clEnqueueAcquireGLObjects failed: " + std::to_string(status)); + + cl_mem clBuffer = (cl_mem) u.handle(ACCESS_READ); + + size_t offset = 0; + size_t src_origin[3] = { 0, 0, 0 }; + size_t region[3] = { (size_t) u.cols, (size_t) u.rows, 1 }; + status = clEnqueueCopyImageToBuffer(q, clImage_, clBuffer, src_origin, region, offset, 0, NULL, + NULL); + if (status != CL_SUCCESS) + throw std::runtime_error("OpenCL: clEnqueueCopyImageToBuffer failed: " + std::to_string(status)); +} +#endif +const cv::Size& FrameBufferContext::size() const { + return framebufferSize_; +} + +void FrameBufferContext::copyTo(cv::UMat& dst) { + if(!getCLExecContext().empty()) { + CLExecScope_t clExecScope(getCLExecContext()); + FrameBufferContext::GLScope glScope(this, GL_FRAMEBUFFER); + FrameBufferContext::FrameBufferScope fbScope(this, framebuffer_); + framebuffer_.copyTo(dst); + } else { + FrameBufferContext::GLScope glScope(this, GL_FRAMEBUFFER); + FrameBufferContext::FrameBufferScope fbScope(this, framebuffer_); + framebuffer_.copyTo(dst); + } +} + +void FrameBufferContext::copyFrom(const cv::UMat& src) { + if(!getCLExecContext().empty()) { + CLExecScope_t clExecScope(getCLExecContext()); + FrameBufferContext::GLScope glScope(this, GL_FRAMEBUFFER); + FrameBufferContext::FrameBufferScope fbScope(this, framebuffer_); + src.copyTo(framebuffer_); + } else { + FrameBufferContext::GLScope glScope(this, GL_FRAMEBUFFER); + FrameBufferContext::FrameBufferScope fbScope(this, framebuffer_); + src.copyTo(framebuffer_); + } +} + +void FrameBufferContext::copyToRootWindow() { + GLScope scope(self_, GL_READ_FRAMEBUFFER); + GL_CHECK(glReadBuffer(GL_COLOR_ATTACHMENT0)); + + GL_CHECK(glActiveTexture(GL_TEXTURE0)); + GL_CHECK(glBindTexture(GL_TEXTURE_2D, onscreenTextureID_)); + GL_CHECK(glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, size().width, size().height)); +} + +cv::ogl::Texture2D& FrameBufferContext::getTexture2D() { + return *texture_; +} + +GLFWwindow* FrameBufferContext::getGLFWWindow() const { + return glfwWindow_; +} + +CLExecContext_t& FrameBufferContext::getCLExecContext() { + return context_; +} + +void FrameBufferContext::blitFrameBufferToFrameBuffer(const cv::Rect& srcViewport, + const cv::Size& targetFbSize, GLuint targetFramebufferID, bool stretch, bool flipY) { + double hf = double(targetFbSize.height) / framebufferSize_.height; + double wf = double(targetFbSize.width) / framebufferSize_.width; + double f; + if (hf > wf) + f = wf; + else + f = hf; + + double fbws = framebufferSize_.width * f; + double fbhs = framebufferSize_.height * f; + + double marginw = (targetFbSize.width - framebufferSize_.width) / 2.0; + double marginh = (targetFbSize.height - 
framebufferSize_.height) / 2.0; + double marginws = (targetFbSize.width - fbws) / 2.0; + double marginhs = (targetFbSize.height - fbhs) / 2.0; + + GLint srcX0 = srcViewport.x; + GLint srcY0 = srcViewport.y; + GLint srcX1 = srcViewport.x + srcViewport.width; + GLint srcY1 = srcViewport.y + srcViewport.height; + GLint dstX0 = stretch ? marginws : marginw; + GLint dstY0 = stretch ? marginhs : marginh; + GLint dstX1 = stretch ? marginws + fbws : marginw + framebufferSize_.width; + GLint dstY1 = stretch ? marginhs + fbhs : marginh + framebufferSize_.height; + if(flipY) { + GLint tmp = dstY0; + dstY0 = dstY1; + dstY1 = tmp; + } + GL_CHECK(glBindFramebuffer(GL_DRAW_FRAMEBUFFER, targetFramebufferID)); + GL_CHECK(glBlitFramebuffer( srcX0, srcY0, srcX1, srcY1, + dstX0, dstY0, dstX1, dstY1, + GL_COLOR_BUFFER_BIT, GL_NEAREST)); +} + +cv::UMat& FrameBufferContext::fb() { + return framebuffer_; +} + +void FrameBufferContext::begin(GLenum framebufferTarget) { + this->makeCurrent(); + GL_CHECK(glBindFramebuffer(framebufferTarget, frameBufferID_)); + GL_CHECK(glBindTexture(GL_TEXTURE_2D, textureID_)); + GL_CHECK(glBindRenderbuffer(GL_RENDERBUFFER, renderBufferID_)); + GL_CHECK( + glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH24_STENCIL8, size().width, size().height)); + GL_CHECK( + glFramebufferRenderbuffer(framebufferTarget, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, renderBufferID_)); + GL_CHECK( + glFramebufferTexture2D(framebufferTarget, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, textureID_, 0)); + assert(glCheckFramebufferStatus(framebufferTarget) == GL_FRAMEBUFFER_COMPLETE); +} + +void FrameBufferContext::end() { + this->makeNoneCurrent(); +} + +void FrameBufferContext::download(cv::UMat& m) { + cv::Mat tmp = m.getMat(cv::ACCESS_WRITE); + assert(tmp.data != nullptr); + GL_CHECK(glReadPixels(0, 0, tmp.cols, tmp.rows, GL_RGBA, GL_UNSIGNED_BYTE, tmp.data)); + tmp.release(); + +} + +void FrameBufferContext::upload(const cv::UMat& m) { + cv::Mat tmp = m.getMat(cv::ACCESS_READ); + assert(tmp.data != nullptr); + GL_CHECK( + glTexSubImage2D( GL_TEXTURE_2D, 0, 0, 0, tmp.cols, tmp.rows, GL_RGBA, GL_UNSIGNED_BYTE, tmp.data)); + tmp.release(); +} + +void FrameBufferContext::acquireFromGL(cv::UMat& m) { +#ifdef HAVE_OPENCL + if (cv::ocl::useOpenCL() && clglSharing_) { + try { + GL_CHECK(fromGLTexture2D(getTexture2D(), m)); + } catch(...) { + clglSharing_ = false; + download(m); + } + return; + } +#endif + { + download(m); + } + //FIXME + cv::flip(m, m, 0); +} + +void FrameBufferContext::releaseToGL(cv::UMat& m) { + //FIXME + cv::flip(m, m, 0); +#ifdef HAVE_OPENCL + if (cv::ocl::useOpenCL() && clglSharing_) { + try { + GL_CHECK(toGLTexture2D(m, getTexture2D())); + } catch(...) 
{ + clglSharing_ = false; + upload(m); + } + return; + } +#endif + { + upload(m); + } +} + +cv::Vec2f FrameBufferContext::position() { + int x, y; + glfwGetWindowPos(getGLFWWindow(), &x, &y); + return cv::Vec2f(x, y); +} + +float FrameBufferContext::pixelRatioX() { + float xscale, yscale; + glfwGetWindowContentScale(getGLFWWindow(), &xscale, &yscale); + + return xscale; +} + +float FrameBufferContext::pixelRatioY() { + float xscale, yscale; + glfwGetWindowContentScale(getGLFWWindow(), &xscale, &yscale); + + return yscale; +} + +void FrameBufferContext::makeCurrent() { + assert(getGLFWWindow() != nullptr); + glfwMakeContextCurrent(getGLFWWindow()); +} + +void FrameBufferContext::makeNoneCurrent() { + glfwMakeContextCurrent(nullptr); +} + + +bool FrameBufferContext::isResizable() { + return glfwGetWindowAttrib(getGLFWWindow(), GLFW_RESIZABLE) == GLFW_TRUE; +} + +void FrameBufferContext::setResizable(bool r) { + glfwSetWindowAttrib(getGLFWWindow(), GLFW_RESIZABLE, r ? GLFW_TRUE : GLFW_FALSE); +} + +void FrameBufferContext::setWindowSize(const cv::Size& sz) { + glfwSetWindowSize(getGLFWWindow(), sz.width, sz.height); +} + +//FIXME cache window size +cv::Size FrameBufferContext::getWindowSize() { + cv::Size sz; + glfwGetWindowSize(getGLFWWindow(), &sz.width, &sz.height); + return sz; +} + +bool FrameBufferContext::isFullscreen() { + return glfwGetWindowMonitor(getGLFWWindow()) != nullptr; +} + +void FrameBufferContext::setFullscreen(bool f) { + auto monitor = glfwGetPrimaryMonitor(); + const GLFWvidmode* mode = glfwGetVideoMode(monitor); + if (f) { + glfwSetWindowMonitor(getGLFWWindow(), monitor, 0, 0, mode->width, mode->height, + mode->refreshRate); + setWindowSize(getNativeFrameBufferSize()); + } else { + glfwSetWindowMonitor(getGLFWWindow(), nullptr, 0, 0, size().width, + size().height, 0); + setWindowSize(size()); + } +} + +cv::Size FrameBufferContext::getNativeFrameBufferSize() { + int w, h; + glfwGetFramebufferSize(getGLFWWindow(), &w, &h); + return cv::Size{w, h}; +} + +//cache window visibility instead of performing a heavy window attrib query. 
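+// Clarifying comment (added): the flag referenced above is isVisible_; isVisible() below only returns this cached value, while setVisible() updates the cache and shows or hides the GLFW window accordingly, avoiding a glfwGetWindowAttrib() call per query.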
+bool FrameBufferContext::isVisible() { + return isVisible_; +} + +void FrameBufferContext::setVisible(bool v) { + isVisible_ = v; + if (isVisible_) + glfwShowWindow(getGLFWWindow()); + else + glfwHideWindow(getGLFWWindow()); +} + +bool FrameBufferContext::isClosed() { + return glfwWindow_ == nullptr; +} + +void FrameBufferContext::close() { + teardown(); + glfwDestroyWindow(getGLFWWindow()); + glfwWindow_ = nullptr; +} + +bool FrameBufferContext::isRoot() { + return isRoot_; +} + + +bool FrameBufferContext::hasParent() { + return hasParent_; +} + +bool FrameBufferContext::hasRootWindow() { + return rootWindow_ != nullptr; +} + +void FrameBufferContext::fence() { + CV_Assert(currentSyncObject_ == 0); + currentSyncObject_ = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0); + CV_Assert(currentSyncObject_ != 0); +} + +bool FrameBufferContext::wait(const uint64_t& timeout) { + if(firstSync_) { + currentSyncObject_ = 0; + firstSync_ = false; + return true; + } + CV_Assert(currentSyncObject_ != 0); + GLuint ret = glClientWaitSync(static_cast(currentSyncObject_), + GL_SYNC_FLUSH_COMMANDS_BIT, timeout); + GL_CHECK(); + CV_Assert(GL_WAIT_FAILED != ret); + if(GL_CONDITION_SATISFIED == ret || GL_ALREADY_SIGNALED == ret) { + currentSyncObject_ = 0; + return true; + } else { + currentSyncObject_ = 0; + return false; + } +} +} +} +} diff --git a/modules/v4d/src/detail/glcontext.cpp b/modules/v4d/src/detail/glcontext.cpp new file mode 100644 index 000000000..d4dbca45e --- /dev/null +++ b/modules/v4d/src/detail/glcontext.cpp @@ -0,0 +1,42 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/detail/glcontext.hpp" + +namespace cv { +namespace v4d { +namespace detail { +GLContext::GLContext(const int32_t& idx, cv::Ptr fbContext) : + idx_(idx), mainFbContext_(fbContext), glFbContext_(new FrameBufferContext(*fbContext->getV4D(), "OpenGL" + std::to_string(idx), fbContext)) { +} + +void GLContext::execute(std::function fn) { + if(!fbCtx()->hasParent()) { + UMat tmp; + mainFbContext_->copyTo(tmp); + fbCtx()->copyFrom(tmp); + } + { + FrameBufferContext::GLScope glScope(fbCtx(), GL_FRAMEBUFFER); + GL_CHECK(glClear(GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT)); + fn(); + } + if(!fbCtx()->hasParent()) { + UMat tmp; + fbCtx()->copyTo(tmp); + mainFbContext_->copyFrom(tmp); + } +} + +const int32_t& GLContext::getIndex() const { + return idx_; +} +cv::Ptr GLContext::fbCtx() { + return glFbContext_; +} + +} +} +} diff --git a/modules/v4d/src/detail/imguicontext.cpp b/modules/v4d/src/detail/imguicontext.cpp new file mode 100644 index 000000000..1f7b5d95e --- /dev/null +++ b/modules/v4d/src/detail/imguicontext.cpp @@ -0,0 +1,99 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/v4d.hpp" +#if defined(OPENCV_V4D_USE_ES3) || defined(EMSCRIPTEN) +# define IMGUI_IMPL_OPENGL_ES3 +#endif + +#define IMGUI_IMPL_OPENGL_LOADER_CUSTOM + +#include "imgui_impl_glfw.h" +#include "imgui_impl_opengl3.h" + +namespace cv { +namespace v4d { +namespace detail { +ImGuiContextImpl::ImGuiContextImpl(cv::Ptr fbContext) : + mainFbContext_(fbContext) { + FrameBufferContext::GLScope glScope(mainFbContext_, GL_FRAMEBUFFER); + IMGUI_CHECKVERSION(); + context_ = ImGui::CreateContext(); + ImGui::SetCurrentContext(context_); + + ImGuiIO& io = ImGui::GetIO(); + (void)io; + io.ConfigFlags |= ImGuiConfigFlags_NavEnableKeyboard; + io.ConfigFlags |= ImGuiConfigFlags_NavEnableGamepad; + ImGui::StyleColorsDark(); + + ImGui_ImplGlfw_InitForOpenGL(mainFbContext_->getGLFWWindow(), false); + ImGui_ImplGlfw_SetCallbacksChainForAllWindows(true); +#if !defined(OPENCV_V4D_USE_ES3) + ImGui_ImplOpenGL3_Init("#version 330"); +#else + ImGui_ImplOpenGL3_Init("#version 300 es"); +#endif +} + +void ImGuiContextImpl::build(std::function fn) { + renderCallback_ = fn; +} + +void ImGuiContextImpl::makeCurrent() { + ImGui::SetCurrentContext(context_); +} + +void ImGuiContextImpl::render(bool showFPS) { + mainFbContext_->makeCurrent(); + ImGui::SetCurrentContext(context_); + + GL_CHECK(glBindFramebuffer(GL_FRAMEBUFFER, 0)); +#if !defined(OPENCV_V4D_USE_ES3) + GL_CHECK(glDrawBuffer(GL_BACK)); +#endif + ImGui_ImplOpenGL3_NewFrame(); + ImGui_ImplGlfw_NewFrame(); + ImGui::NewFrame(); + if (showFPS) { + static bool open_ptr[1] = { true }; + static ImGuiWindowFlags window_flags = 0; +// window_flags |= ImGuiWindowFlags_NoBackground; + window_flags |= ImGuiWindowFlags_NoBringToFrontOnFocus; + window_flags |= ImGuiWindowFlags_NoMove; + window_flags |= ImGuiWindowFlags_NoScrollWithMouse; + window_flags |= ImGuiWindowFlags_AlwaysAutoResize; + window_flags |= ImGuiWindowFlags_NoSavedSettings; + window_flags |= ImGuiWindowFlags_NoFocusOnAppearing; + window_flags |= ImGuiWindowFlags_NoNav; + window_flags |= ImGuiWindowFlags_NoDecoration; + window_flags |= ImGuiWindowFlags_NoInputs; + static ImVec2 pos(0, 0); + ImGui::SetNextWindowPos(pos, ImGuiCond_Once); + ImGui::PushStyleColor(ImGuiCol_WindowBg, ImVec4(0.0f, 0.0f, 0.0f, 0.5f)); + ImGui::Begin("Display", open_ptr, window_flags); + ImGui::Text("%.3f ms/frame (%.1f FPS)", (1000.0f / Global::fps()) , Global::fps()); + ImGui::End(); + ImGui::PopStyleColor(1); + std::stringstream ss; + TimeTracker::getInstance()->print(ss); + std::string line; + ImGui::PushStyleColor(ImGuiCol_WindowBg, ImVec4(0.0f, 0.0f, 0.0f, 0.5f)); + ImGui::Begin("Time Tracking"); + while(getline(ss, line)) { + ImGui::Text("%s", line.c_str()); + } + ImGui::End(); + ImGui::PopStyleColor(1); + } + if (renderCallback_) + renderCallback_(context_); + ImGui::Render(); + ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData()); + mainFbContext_->makeNoneCurrent(); +} +} +} +} diff --git a/modules/v4d/src/detail/nanovgcontext.cpp b/modules/v4d/src/detail/nanovgcontext.cpp new file mode 100644 index 000000000..781ddb92d --- /dev/null +++ b/modules/v4d/src/detail/nanovgcontext.cpp @@ -0,0 +1,82 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/detail/nanovgcontext.hpp" +#include "opencv2/v4d/nvg.hpp" +#include "nanovg_gl.h" + +namespace cv { +namespace v4d { +namespace detail { + +NanoVGContext::NanoVGContext(cv::Ptr<FrameBufferContext> fbContext) : + mainFbContext_(fbContext), nvgFbContext_(new FrameBufferContext(*fbContext->getV4D(), "NanoVG", fbContext)), context_( + nullptr) { + FrameBufferContext::GLScope glScope(fbCtx(), GL_FRAMEBUFFER); +#if defined(OPENCV_V4D_USE_ES3) + context_ = nvgCreateGLES3(NVG_ANTIALIAS | NVG_STENCIL_STROKES); +#else + context_ = nvgCreateGL3(NVG_ANTIALIAS | NVG_STENCIL_STROKES); +#endif + if (!context_) + CV_Error(Error::StsError, "Could not initialize NanoVG!"); + nvgCreateFont(context_, "icons", "modules/v4d/assets/fonts/entypo.ttf"); + nvgCreateFont(context_, "sans", "modules/v4d/assets/fonts/Roboto-Regular.ttf"); + nvgCreateFont(context_, "sans-bold", "modules/v4d/assets/fonts/Roboto-Bold.ttf"); +} + +void NanoVGContext::execute(std::function<void()> fn) { + if (!fbCtx()->hasParent()) { + UMat tmp; + mainFbContext_->copyTo(tmp); + fbCtx()->copyFrom(tmp); + } + + { + FrameBufferContext::GLScope glScope(fbCtx(), GL_FRAMEBUFFER); + glClear(GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT); + NanoVGContext::Scope nvgScope(*this); + cv::v4d::nvg::detail::NVG::initializeContext(context_); + fn(); + } + + if (!fbCtx()->hasParent()) { + UMat tmp; + fbCtx()->copyTo(tmp); + mainFbContext_->copyFrom(tmp); + } +} + + +void NanoVGContext::begin() { + float w = fbCtx()->size().width; + float h = fbCtx()->size().height; + float ws = w / scale_.width; + float hs = h / scale_.height; + float r = fbCtx()->pixelRatioX(); + CV_UNUSED(ws); + CV_UNUSED(hs); + nvgSave(context_); + nvgBeginFrame(context_, w, h, r); + nvgTranslate(context_, 0, h - hs); +} + +void NanoVGContext::end() { + //FIXME make nvgCancelFrame possible + + nvgEndFrame(context_); + nvgRestore(context_); +} + +void NanoVGContext::setScale(const cv::Size_<float>& scale) { + scale_ = scale; +} + +cv::Ptr<FrameBufferContext> NanoVGContext::fbCtx() { + return nvgFbContext_; +} +} +} +} diff --git a/modules/v4d/src/detail/nvg.cpp b/modules/v4d/src/detail/nvg.cpp new file mode 100644 index 000000000..dfc5b9228 --- /dev/null +++ b/modules/v4d/src/detail/nvg.cpp @@ -0,0 +1,45 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/nvg.hpp" +#include +#include "opencv2/core.hpp" + +#include + +namespace cv { +namespace v4d { +/*! + * In general, please refer to https://github.com/memononen/nanovg/blob/master/src/nanovg.h for reference. 
+ */ +namespace nvg { +Paint::Paint(const NVGpaint& np) { + memcpy(this->xform, np.xform, sizeof(this->xform)); + memcpy(this->extent, np.extent, sizeof(this->extent)); + this->radius = np.radius; + this->feather = np.feather; + this->innerColor = cv::Scalar(np.innerColor.rgba[2] * 255, np.innerColor.rgba[1] * 255, + np.innerColor.rgba[0] * 255, np.innerColor.rgba[3] * 255); + this->outerColor = cv::Scalar(np.outerColor.rgba[2] * 255, np.outerColor.rgba[1] * 255, + np.outerColor.rgba[0] * 255, np.outerColor.rgba[3] * 255); + this->image = np.image; +} + +NVGpaint Paint::toNVGpaint() { + NVGpaint np; + memcpy(np.xform, this->xform, sizeof(this->xform)); + memcpy(np.extent, this->extent, sizeof(this->extent)); + np.radius = this->radius; + np.feather = this->feather; + np.innerColor = nvgRGBA(this->innerColor[2], this->innerColor[1], this->innerColor[0], + this->innerColor[3]); + np.outerColor = nvgRGBA(this->outerColor[2], this->outerColor[1], this->outerColor[0], + this->outerColor[3]); + np.image = this->image; + return np; +} +} +} +} diff --git a/modules/v4d/src/detail/sinkcontext.cpp b/modules/v4d/src/detail/sinkcontext.cpp new file mode 100644 index 000000000..77ed12ebe --- /dev/null +++ b/modules/v4d/src/detail/sinkcontext.cpp @@ -0,0 +1,48 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include "../../include/opencv2/v4d/detail/sinkcontext.hpp" +#include "../../include/opencv2/v4d/v4d.hpp" + +#include + +namespace cv { +namespace v4d { +namespace detail { + +SinkContext::SinkContext(cv::Ptr mainFbContext) : mainFbContext_(mainFbContext) { +} + +void SinkContext::execute(std::function fn) { + if (hasContext()) { + CLExecScope_t scope(getCLExecContext()); + fn(); + } else { + fn(); + } + auto v4d = mainFbContext_->getV4D(); + if(v4d->hasSink()) { + v4d->getSink()->operator ()(v4d->sourceCtx()->sequenceNumber(), sinkBuffer()); + } +} + +bool SinkContext::hasContext() { + return !context_.empty(); +} + +void SinkContext::copyContext() { + context_ = CLExecContext_t::getCurrent(); +} + +CLExecContext_t SinkContext::getCLExecContext() { + return context_; +} + +cv::UMat& SinkContext::sinkBuffer() { + return sinkBuffer_; +} +} +} +} diff --git a/modules/v4d/src/detail/sourcecontext.cpp b/modules/v4d/src/detail/sourcecontext.cpp new file mode 100644 index 000000000..b876305af --- /dev/null +++ b/modules/v4d/src/detail/sourcecontext.cpp @@ -0,0 +1,76 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include "../../include/opencv2/v4d/detail/sourcecontext.hpp" +#include "../../include/opencv2/v4d/v4d.hpp" +#include + +namespace cv { +namespace v4d { +namespace detail { + +SourceContext::SourceContext(cv::Ptr mainFbContext) : mainFbContext_(mainFbContext) { +} + +void SourceContext::execute(std::function fn) { + if (hasContext()) { + CLExecScope_t scope(getCLExecContext()); + if (mainFbContext_->getV4D()->hasSource()) { + auto src = mainFbContext_->getV4D()->getSource(); + + if(src->isOpen()) { + auto p = src->operator ()(); + currentSeqNr_ = p.first; + + if(p.second.empty()) { + CV_Error(cv::Error::StsError, "End of stream"); + } + + resizePreserveAspectRatio(p.second, captureBufferRGB_, mainFbContext_->size()); + cv::cvtColor(captureBufferRGB_, sourceBuffer(), cv::COLOR_RGB2BGRA); + } + } + fn(); + } else { + if (mainFbContext_->getV4D()->hasSource()) { + auto src = mainFbContext_->getV4D()->getSource(); + + if(src->isOpen()) { + auto p = src->operator ()(); + currentSeqNr_ = p.first; + + if(p.second.empty()) { + CV_Error(cv::Error::StsError, "End of stream"); + } + resizePreserveAspectRatio(p.second, captureBufferRGB_, mainFbContext_->size()); + cv::cvtColor(captureBufferRGB_, sourceBuffer(), cv::COLOR_RGB2BGRA); + } + } + fn(); + } +} + +uint64_t SourceContext::sequenceNumber() { + return currentSeqNr_; +} + +bool SourceContext::hasContext() { + return !context_.empty(); +} + +void SourceContext::copyContext() { + context_ = CLExecContext_t::getCurrent(); +} + +CLExecContext_t SourceContext::getCLExecContext() { + return context_; +} + +cv::UMat& SourceContext::sourceBuffer() { + return captureBuffer_; +} +} +} +} diff --git a/modules/v4d/src/detail/timetracker.cpp b/modules/v4d/src/detail/timetracker.cpp new file mode 100644 index 000000000..6f7a1e2b0 --- /dev/null +++ b/modules/v4d/src/detail/timetracker.cpp @@ -0,0 +1,16 @@ +/* + * time_tracker.cpp + * + * Created on: Mar 22, 2014 + * Author: elchaschab + */ + +#include "opencv2/v4d/detail/timetracker.hpp" + +TimeTracker* TimeTracker::instance_; + +TimeTracker::TimeTracker() : enabled_(false) { +} + +TimeTracker::~TimeTracker() { +} diff --git a/modules/v4d/src/nvg.cpp b/modules/v4d/src/nvg.cpp new file mode 100644 index 000000000..fcc186e6c --- /dev/null +++ b/modules/v4d/src/nvg.cpp @@ -0,0 +1,716 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/nvg.hpp" +#include "opencv2/v4d/v4d.hpp" + +namespace cv { +namespace v4d { +namespace nvg { +namespace detail { +class NVG; + +thread_local NVG* NVG::nvg_instance_ = nullptr; + +void NVG::initializeContext(NVGcontext* ctx) { + if (nvg_instance_ != nullptr) + delete nvg_instance_; + nvg_instance_ = new NVG(ctx); +} + +NVG* NVG::getCurrentContext() { + assert(nvg_instance_ != nullptr); + return nvg_instance_; +} + +int NVG::createFont(const char* name, const char* filename) { + return nvgCreateFont(getContext(), name, filename); +} + +int NVG::createFontMem(const char* name, unsigned char* data, int ndata, int freeData) { + return nvgCreateFontMem(getContext(), name, data, ndata, freeData); +} + +int NVG::findFont(const char* name) { + return nvgFindFont(getContext(), name); +} + +int NVG::addFallbackFontId(int baseFont, int fallbackFont) { + return nvgAddFallbackFontId(getContext(), baseFont, fallbackFont); +} + +int NVG::addFallbackFont(const char* baseFont, const char* fallbackFont) { + return nvgAddFallbackFont(getContext(), baseFont, fallbackFont); +} + +void NVG::fontSize(float size) { + nvgFontSize(getContext(), size); +} + +void NVG::fontBlur(float blur) { + nvgFontBlur(getContext(), blur); +} + +void NVG::textLetterSpacing(float spacing) { + nvgTextLetterSpacing(getContext(), spacing); +} + +void NVG::textLineHeight(float lineHeight) { + nvgTextLineHeight(getContext(), lineHeight); +} + +void NVG::textAlign(int align) { + nvgTextAlign(getContext(), align); +} + +void NVG::fontFaceId(int font) { + nvgFontFaceId(getContext(), font); +} + +void NVG::fontFace(const char* font) { + nvgFontFace(getContext(), font); +} + +float NVG::text(float x, float y, const char* string, const char* end) { + return nvgText(getContext(), x, y, string, end); +} + +void NVG::textBox(float x, float y, float breakRowWidth, const char* string, const char* end) { + nvgTextBox(getContext(), x, y, breakRowWidth, string, end); +} + +float NVG::textBounds(float x, float y, const char* string, const char* end, float* bounds) { + return nvgTextBounds(getContext(), x, y, string, end, bounds); +} + +void NVG::textBoxBounds(float x, float y, float breakRowWidth, const char* string, const char* end, + float* bounds) { + nvgTextBoxBounds(getContext(), x, y, breakRowWidth, string, end, bounds); +} + +int NVG::textGlyphPositions(float x, float y, const char* string, const char* end, + GlyphPosition* positions, int maxPositions) { + return nvgTextGlyphPositions(getContext(), x, y, string, end, positions, maxPositions); +} + +void NVG::textMetrics(float* ascender, float* descender, float* lineh) { + nvgTextMetrics(getContext(), ascender, descender, lineh); +} + +int NVG::textBreakLines(const char* string, const char* end, float breakRowWidth, TextRow* rows, + int maxRows) { + return nvgTextBreakLines(getContext(), string, end, breakRowWidth, rows, maxRows); +} + +void NVG::save() { + nvgSave(getContext()); +} + +void NVG::restore() { + nvgRestore(getContext()); +} + +void NVG::reset() { + nvgReset(getContext()); +} + +//void NVG::shapeAntiAlias(int enabled) { +// nvgShapeAntiAlias(getContext(), enabled); +//} + +void NVG::strokeColor(const cv::Scalar& bgra) { + nvgStrokeColor(getContext(), nvgRGBA(bgra[2], bgra[1], bgra[0], bgra[3])); +} + +void NVG::strokePaint(Paint paint) { + NVGpaint np = paint.toNVGpaint(); + nvgStrokePaint(getContext(), np); +} + +void NVG::fillColor(const cv::Scalar& rgba) { + nvgFillColor(getContext(), nvgRGBA(rgba[0], rgba[1], 
rgba[2], rgba[3])); +} + +void NVG::fillPaint(Paint paint) { + NVGpaint np = paint.toNVGpaint(); + nvgFillPaint(getContext(), np); +} + +void NVG::miterLimit(float limit) { + nvgMiterLimit(getContext(), limit); +} + +void NVG::strokeWidth(float size) { + nvgStrokeWidth(getContext(), size); +} + +void NVG::lineCap(int cap) { + nvgLineCap(getContext(), cap); +} + +void NVG::lineJoin(int join) { + nvgLineJoin(getContext(), join); +} + +void NVG::globalAlpha(float alpha) { + nvgGlobalAlpha(getContext(), alpha); +} + +void NVG::resetTransform() { + nvgResetTransform(getContext()); +} + +void NVG::transform(float a, float b, float c, float d, float e, float f) { + nvgTransform(getContext(), a, b, c, d, e, f); +} + +void NVG::translate(float x, float y) { + nvgTranslate(getContext(), x, y); +} + +void NVG::rotate(float angle) { + nvgRotate(getContext(), angle); +} + +void NVG::skewX(float angle) { + nvgSkewX(getContext(), angle); +} + +void NVG::skewY(float angle) { + nvgSkewY(getContext(), angle); +} + +void NVG::scale(float x, float y) { + nvgScale(getContext(), x, y); +} + +void NVG::currentTransform(float* xform) { + nvgCurrentTransform(getContext(), xform); +} + +void NVG::transformIdentity(float* dst) { + nvgTransformIdentity(dst); +} + +void NVG::transformTranslate(float* dst, float tx, float ty) { + nvgTransformTranslate(dst, tx, ty); +} + +void NVG::transformScale(float* dst, float sx, float sy) { + nvgTransformScale(dst, sx, sy); +} + +void NVG::transformRotate(float* dst, float a) { + nvgTransformRotate(dst, a); +} + +void NVG::transformSkewX(float* dst, float a) { + nvgTransformSkewX(dst, a); +} + +void NVG::transformSkewY(float* dst, float a) { + nvgTransformSkewY(dst, a); +} + +void NVG::transformMultiply(float* dst, const float* src) { + nvgTransformMultiply(dst, src); +} + +void NVG::transformPremultiply(float* dst, const float* src) { + nvgTransformPremultiply(dst, src); +} + +int NVG::transformInverse(float* dst, const float* src) { + return nvgTransformInverse(dst, src); +} + +void NVG::transformPoint(float* dstx, float* dsty, const float* xform, float srcx, float srcy) { + nvgTransformPoint(dstx, dsty, xform, srcx, srcy); +} + +float NVG::degToRad(float deg) { + return nvgDegToRad(deg); +} + +float NVG::radToDeg(float rad) { + return nvgRadToDeg(rad); +} + +int NVG::createImage(const char* filename, int imageFlags) { + return nvgCreateImage(getContext(), filename, imageFlags); +} + +int NVG::createImageMem(int imageFlags, unsigned char* data, int ndata) { + return nvgCreateImageMem(getContext(), imageFlags, data, ndata); +} + +int NVG::createImageRGBA(int w, int h, int imageFlags, const unsigned char* data) { + return nvgCreateImageRGBA(getContext(), w, h, imageFlags, data); +} + +void NVG::updateImage(int image, const unsigned char* data) { + nvgUpdateImage(getContext(), image, data); +} + +void NVG::imageSize(int image, int* w, int* h) { + nvgImageSize(getContext(), image, w, h); +} + +void NVG::deleteImage(int image) { + nvgDeleteImage(getContext(), image); +} + +void NVG::beginPath() { + nvgBeginPath(getContext()); +} + +void NVG::moveTo(float x, float y) { + nvgMoveTo(getContext(), x, y); +} + +void NVG::lineTo(float x, float y) { + nvgLineTo(getContext(), x, y); +} + +void NVG::bezierTo(float c1x, float c1y, float c2x, float c2y, float x, float y) { + nvgBezierTo(getContext(), c1x, c1y, c2x, c2y, x, y); +} + +void NVG::quadTo(float cx, float cy, float x, float y) { + nvgQuadTo(getContext(), cx, cy, x, y); +} + +void NVG::arcTo(float x1, float y1, float x2, float y2, 
float radius) { + nvgArcTo(getContext(), x1, y1, x2, y2, radius); +} + +void NVG::closePath() { + nvgClosePath(getContext()); +} + +void NVG::pathWinding(int dir) { + nvgPathWinding(getContext(), dir); +} + +void NVG::arc(float cx, float cy, float r, float a0, float a1, int dir) { + nvgArc(getContext(), cx, cy, r, a0, a1, dir); +} + +void NVG::rect(float x, float y, float w, float h) { + nvgRect(getContext(), x, y, w, h); +} + +void NVG::roundedRect(float x, float y, float w, float h, float r) { + nvgRoundedRect(getContext(), x, y, w, h, r); +} + +void NVG::roundedRectVarying(float x, float y, float w, float h, float radTopLeft, + float radTopRight, float radBottomRight, float radBottomLeft) { + nvgRoundedRectVarying(getContext(), x, y, w, h, radTopLeft, radTopRight, radBottomRight, + radBottomLeft); +} + +void NVG::ellipse(float cx, float cy, float rx, float ry) { + nvgEllipse(getContext(), cx, cy, rx, ry); +} + +void NVG::circle(float cx, float cy, float r) { + nvgCircle(getContext(), cx, cy, r); +} + +void NVG::fill() { + nvgFill(getContext()); +} + +void NVG::stroke() { + nvgStroke(getContext()); +} + +Paint NVG::linearGradient(float sx, float sy, float ex, float ey, const cv::Scalar& icol, + const cv::Scalar& ocol) { + NVGpaint np = nvgLinearGradient(getContext(), sx, sy, ex, ey, + nvgRGBA(icol[2], icol[1], icol[0], icol[3]), + nvgRGBA(ocol[2], ocol[1], ocol[0], ocol[3])); + return Paint(np); +} + +Paint NVG::boxGradient(float x, float y, float w, float h, float r, float f, const cv::Scalar& icol, + const cv::Scalar& ocol) { + NVGpaint np = nvgBoxGradient(getContext(), x, y, w, h, r, f, + nvgRGBA(icol[2], icol[1], icol[0], icol[3]), + nvgRGBA(ocol[2], ocol[1], ocol[0], ocol[3])); + return Paint(np); +} + +Paint NVG::radialGradient(float cx, float cy, float inr, float outr, const cv::Scalar& icol, + const cv::Scalar& ocol) { + NVGpaint np = nvgRadialGradient(getContext(), cx, cy, inr, outr, + nvgRGBA(icol[2], icol[1], icol[0], icol[3]), + nvgRGBA(ocol[2], ocol[1], ocol[0], ocol[3])); + return Paint(np); +} + +Paint NVG::imagePattern(float ox, float oy, float ex, float ey, float angle, int image, + float alpha) { + NVGpaint np = nvgImagePattern(getContext(), ox, oy, ex, ey, angle, image, alpha); + return Paint(np); +} + +void NVG::scissor(float x, float y, float w, float h) { + nvgScissor(getContext(), x, y, w, h); +} + +void NVG::intersectScissor(float x, float y, float w, float h) { + nvgIntersectScissor(getContext(), x, y, w, h); +} + +void NVG::resetScissor() { + nvgResetScissor(getContext()); +} +} + +int createFont(const char* name, const char* filename) { + return detail::NVG::getCurrentContext()->createFont(name, filename); +} + +int createFontMem(const char* name, unsigned char* data, int ndata, int freeData) { + return detail::NVG::getCurrentContext()->createFontMem(name, data, ndata, freeData); +} + +int findFont(const char* name) { + return detail::NVG::getCurrentContext()->findFont(name); +} + +int addFallbackFontId(int baseFont, int fallbackFont) { + return detail::NVG::getCurrentContext()->addFallbackFontId(baseFont, fallbackFont); +} +int addFallbackFont(const char* baseFont, const char* fallbackFont) { + return detail::NVG::getCurrentContext()->addFallbackFont(baseFont, fallbackFont); +} + +void fontSize(float size) { + detail::NVG::getCurrentContext()->fontSize(size); +} + +void fontBlur(float blur) { + detail::NVG::getCurrentContext()->fontBlur(blur); +} + +void textLetterSpacing(float spacing) { + detail::NVG::getCurrentContext()->textLetterSpacing(spacing); +} + 
+void textLineHeight(float lineHeight) { + detail::NVG::getCurrentContext()->textLineHeight(lineHeight); +} + +void textAlign(int align) { + detail::NVG::getCurrentContext()->textAlign(align); +} + +void fontFaceId(int font) { + detail::NVG::getCurrentContext()->fontFaceId(font); +} + +void fontFace(const char* font) { + detail::NVG::getCurrentContext()->fontFace(font); +} + +float text(float x, float y, const char* string, const char* end) { + return detail::NVG::getCurrentContext()->text(x, y, string, end); +} + +void textBox(float x, float y, float breakRowWidth, const char* string, const char* end) { + detail::NVG::getCurrentContext()->textBox(x, y, breakRowWidth, string, end); +} + +float textBounds(float x, float y, const char* string, const char* end, float* bounds) { + return detail::NVG::getCurrentContext()->textBounds(x, y, string, end, bounds); +} + +void textBoxBounds(float x, float y, float breakRowWidth, const char* string, const char* end, + float* bounds) { + detail::NVG::getCurrentContext()->textBoxBounds(x, y, breakRowWidth, string, end, bounds); +} + +int textGlyphPositions(float x, float y, const char* string, const char* end, + GlyphPosition* positions, int maxPositions) { + return detail::NVG::getCurrentContext()->textGlyphPositions(x, y, string, end, positions, + maxPositions); +} + +void textMetrics(float* ascender, float* descender, float* lineh) { + detail::NVG::getCurrentContext()->textMetrics(ascender, descender, lineh); +} + +int textBreakLines(const char* string, const char* end, float breakRowWidth, TextRow* rows, + int maxRows) { + return detail::NVG::getCurrentContext()->textBreakLines(string, end, breakRowWidth, rows, + maxRows); +} + +void save() { + detail::NVG::getCurrentContext()->save(); +} + +void restore() { + detail::NVG::getCurrentContext()->restore(); +} + +void reset() { + detail::NVG::getCurrentContext()->reset(); +} + +//void shapeAntiAlias(int enabled) { +// detail::NVG::getCurrentContext()->strokeColor(enabled); +//} + +void strokeColor(const cv::Scalar& bgra) { + detail::NVG::getCurrentContext()->strokeColor(bgra); +} + +void strokePaint(Paint paint) { + detail::NVG::getCurrentContext()->strokePaint(paint); +} + +void fillColor(const cv::Scalar& color) { + detail::NVG::getCurrentContext()->fillColor(color); +} + +void fillPaint(Paint paint) { + detail::NVG::getCurrentContext()->fillPaint(paint); +} + +void miterLimit(float limit) { + detail::NVG::getCurrentContext()->miterLimit(limit); +} + +void strokeWidth(float size) { + detail::NVG::getCurrentContext()->strokeWidth(size); +} + +void lineCap(int cap) { + detail::NVG::getCurrentContext()->lineCap(cap); +} + +void lineJoin(int join) { + detail::NVG::getCurrentContext()->lineJoin(join); +} + +void globalAlpha(float alpha) { + detail::NVG::getCurrentContext()->globalAlpha(alpha); +} + +void resetTransform() { + detail::NVG::getCurrentContext()->resetTransform(); +} + +void transform(float a, float b, float c, float d, float e, float f) { + detail::NVG::getCurrentContext()->transform(a, b, c, d, e, f); +} + +void translate(float x, float y) { + detail::NVG::getCurrentContext()->translate(x, y); +} + +void rotate(float angle) { + detail::NVG::getCurrentContext()->rotate(angle); +} + +void skewX(float angle) { + detail::NVG::getCurrentContext()->skewX(angle); +} + +void skewY(float angle) { + detail::NVG::getCurrentContext()->skewY(angle); +} + +void scale(float x, float y) { + detail::NVG::getCurrentContext()->scale(x, y); +} + +void currentTransform(float* xform) { + 
detail::NVG::getCurrentContext()->currentTransform(xform); +} + +void transformIdentity(float* dst) { + detail::NVG::getCurrentContext()->transformIdentity(dst); +} + +void transformTranslate(float* dst, float tx, float ty) { + detail::NVG::getCurrentContext()->transformTranslate(dst, tx, ty); +} + +void transformScale(float* dst, float sx, float sy) { + detail::NVG::getCurrentContext()->transformScale(dst, sx, sy); +} + +void transformRotate(float* dst, float a) { + detail::NVG::getCurrentContext()->transformRotate(dst, a); +} + +void transformSkewX(float* dst, float a) { + detail::NVG::getCurrentContext()->transformSkewX(dst, a); +} + +void transformSkewY(float* dst, float a) { + detail::NVG::getCurrentContext()->transformSkewY(dst, a); +} + +void transformMultiply(float* dst, const float* src) { + detail::NVG::getCurrentContext()->transformMultiply(dst, src); +} + +void transformPremultiply(float* dst, const float* src) { + detail::NVG::getCurrentContext()->transformPremultiply(dst, src); +} + +int transformInverse(float* dst, const float* src) { + return detail::NVG::getCurrentContext()->transformInverse(dst, src); +} + +void transformPoint(float* dstx, float* dsty, const float* xform, float srcx, float srcy) { + return detail::NVG::getCurrentContext()->transformPoint(dstx, dsty, xform, srcx, srcy); +} + +float degToRad(float deg) { + return detail::NVG::getCurrentContext()->degToRad(deg); +} + +float radToDeg(float rad) { + return detail::NVG::getCurrentContext()->radToDeg(rad); +} + +int createImage(const char* filename, int imageFlags) { + return detail::NVG::getCurrentContext()->createImage(filename, imageFlags); +} + +int createImageMem(int imageFlags, unsigned char* data, int ndata) { + return detail::NVG::getCurrentContext()->createImageMem(imageFlags, data, ndata); +} + +int createImageRGBA(int w, int h, int imageFlags, const unsigned char* data) { + return detail::NVG::getCurrentContext()->createImageRGBA(w, h, imageFlags, data); +} + +void updateImage(int image, const unsigned char* data) { + detail::NVG::getCurrentContext()->updateImage(image, data); +} + +void imageSize(int image, int* w, int* h) { + detail::NVG::getCurrentContext()->imageSize(image, w, h); +} + +void deleteImage(int image) { + detail::NVG::getCurrentContext()->deleteImage(image); +} + +void beginPath() { + detail::NVG::getCurrentContext()->beginPath(); +} +void moveTo(float x, float y) { + detail::NVG::getCurrentContext()->moveTo(x, y); +} + +void lineTo(float x, float y) { + detail::NVG::getCurrentContext()->lineTo(x, y); +} + +void bezierTo(float c1x, float c1y, float c2x, float c2y, float x, float y) { + detail::NVG::getCurrentContext()->bezierTo(c1x, c1y, c2x, c2y, x, y); +} + +void quadTo(float cx, float cy, float x, float y) { + detail::NVG::getCurrentContext()->quadTo(cx, cy, x, y); +} + +void arcTo(float x1, float y1, float x2, float y2, float radius) { + detail::NVG::getCurrentContext()->arcTo(x1, y1, x2, y2, radius); +} + +void closePath() { + detail::NVG::getCurrentContext()->closePath(); +} + +void pathWinding(int dir) { + detail::NVG::getCurrentContext()->pathWinding(dir); +} + +void arc(float cx, float cy, float r, float a0, float a1, int dir) { + detail::NVG::getCurrentContext()->arc(cx, cy, r, a0, a1, dir); +} + +void rect(float x, float y, float w, float h) { + detail::NVG::getCurrentContext()->rect(x, y, w, h); +} + +void roundedRect(float x, float y, float w, float h, float r) { + detail::NVG::getCurrentContext()->roundedRect(x, y, w, h, r); +} + +void roundedRectVarying(float x, float 
y, float w, float h, float radTopLeft, float radTopRight, + float radBottomRight, float radBottomLeft) { + detail::NVG::getCurrentContext()->roundedRectVarying(x, y, w, h, radTopLeft, radTopRight, + radBottomRight, radBottomLeft); +} + +void ellipse(float cx, float cy, float rx, float ry) { + detail::NVG::getCurrentContext()->ellipse(cx, cy, rx, ry); +} + +void circle(float cx, float cy, float r) { + detail::NVG::getCurrentContext()->circle(cx, cy, r); +} + +void fill() { + detail::NVG::getCurrentContext()->fill(); +} + +void stroke() { + detail::NVG::getCurrentContext()->stroke(); +} + +Paint linearGradient(float sx, float sy, float ex, float ey, const cv::Scalar& icol, + const cv::Scalar& ocol) { + return detail::NVG::getCurrentContext()->linearGradient(sx, sy, ex, ey, icol, ocol); +} + +Paint boxGradient(float x, float y, float w, float h, float r, float f, const cv::Scalar& icol, + const cv::Scalar& ocol) { + return detail::NVG::getCurrentContext()->boxGradient(x, y, w, h, r, f, icol, ocol); +} + +Paint radialGradient(float cx, float cy, float inr, float outr, const cv::Scalar& icol, + const cv::Scalar& ocol) { + return detail::NVG::getCurrentContext()->radialGradient(cx, cy, inr, outr, icol, ocol); +} + +Paint imagePattern(float ox, float oy, float ex, float ey, float angle, int image, float alpha) { + return detail::NVG::getCurrentContext()->imagePattern(ox, oy, ex, ey, angle, image, alpha); +} + +void scissor(float x, float y, float w, float h) { + detail::NVG::getCurrentContext()->scissor(x, y, w, h); +} + +void intersectScissor(float x, float y, float w, float h) { + detail::NVG::getCurrentContext()->intersectScissor(x, y, w, h); +} + +void resetScissor() { + detail::NVG::getCurrentContext()->resetScissor(); +} + +void clear(const cv::Scalar& bgra) { + const float& b = bgra[0] / 255.0f; + const float& g = bgra[1] / 255.0f; + const float& r = bgra[2] / 255.0f; + const float& a = bgra[3] / 255.0f; + GL_CHECK(glClearColor(r, g, b, a)); + GL_CHECK(glClear(GL_COLOR_BUFFER_BIT | GL_STENCIL_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)); +} +} +} +} diff --git a/modules/v4d/src/resequence.cpp b/modules/v4d/src/resequence.cpp new file mode 100644 index 000000000..f0c2b3091 --- /dev/null +++ b/modules/v4d/src/resequence.cpp @@ -0,0 +1,33 @@ +#include "../include/opencv2/v4d/detail/resequence.hpp" +#include + +namespace cv { +namespace v4d { + void Resequence::finish() { + std::unique_lock lock(putMtx_); + finish_ = true; + notify(); + } + + void Resequence::notify() { + cv_.notify_all(); + } + + void Resequence::waitFor(const uint64_t& seq) { + while(true) { + { + std::unique_lock lock(putMtx_); + if(finish_) + break; + + if(seq == nextSeq_) { + ++nextSeq_; + break; + } + } + std::unique_lock lock(waitMtx_); + cv_.wait(lock); + } + } +} /* namespace v4d */ +} /* namespace cv */ diff --git a/modules/v4d/src/scene.cpp b/modules/v4d/src/scene.cpp new file mode 100644 index 000000000..b223f82c7 --- /dev/null +++ b/modules/v4d/src/scene.cpp @@ -0,0 +1,480 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + + +#include "../include/opencv2/v4d/scene.hpp" +#include +#include +#include +#include + +namespace cv { +namespace v4d { +namespace gl { + +#include + +cv::Vec3f cross(const cv::Vec3f& v1, const cv::Vec3f& v2) { + return cv::Vec3f(v1[1] * v2[2] - v1[2] * v2[1], + v1[2] * v2[0] - v1[0] * v2[2], + v1[0] * v2[1] - v1[1] * v2[0]); +} + +void releaseAssimpScene(const aiScene* scene) { + if (scene) { + for (unsigned int i = 0; i < scene->mNumMeshes; ++i) { + delete[] scene->mMeshes[i]->mVertices; + delete[] scene->mMeshes[i]->mNormals; + for (unsigned int j = 0; j < scene->mMeshes[i]->mNumFaces; ++j) { + delete[] scene->mMeshes[i]->mFaces[j].mIndices; + } + delete[] scene->mMeshes[i]->mFaces; + delete scene->mMeshes[i]; + } + + delete[] scene->mMeshes; + delete scene->mRootNode; + delete scene; + } +} + +aiScene* createAssimpScene(std::vector& vertices) { + if (vertices.size() % 3 != 0) { + vertices.resize(vertices.size() / 3); + } + + aiScene* scene = new aiScene(); + aiMesh* mesh = new aiMesh(); + + // Set vertices + mesh->mVertices = new aiVector3D[vertices.size()]; + for (size_t i = 0; i < vertices.size(); ++i) { + mesh->mVertices[i] = aiVector3D(vertices[i].x, vertices[i].y, vertices[i].z); + } + mesh->mNumVertices = static_cast(vertices.size()); + + // Generate normals + mesh->mNormals = new aiVector3D[mesh->mNumVertices]; + std::fill(mesh->mNormals, mesh->mNormals + mesh->mNumVertices, aiVector3D(0.0f, 0.0f, 0.0f)); + + size_t numFaces = vertices.size() / 3; // Assuming each face has 3 vertices + mesh->mFaces = new aiFace[numFaces]; + mesh->mNumFaces = static_cast(numFaces); + + for (size_t i = 0; i < numFaces; ++i) { + aiFace& face = mesh->mFaces[i]; + face.mIndices = new unsigned int[3]; // Assuming each face has 3 vertices + face.mIndices[0] = static_cast(3 * i); + face.mIndices[1] = static_cast(3 * i + 1); + face.mIndices[2] = static_cast(3 * i + 2); + face.mNumIndices = 3; + + // Calculate normal for this face + aiVector3D edge1 = mesh->mVertices[face.mIndices[1]] - mesh->mVertices[face.mIndices[0]]; + aiVector3D edge2 = mesh->mVertices[face.mIndices[2]] - mesh->mVertices[face.mIndices[0]]; + aiVector3D normal = edge1 ^ edge2; // Cross product + normal.Normalize(); + + // Assign the computed normal to all three vertices of the triangle + mesh->mNormals[face.mIndices[0]] = normal; + mesh->mNormals[face.mIndices[1]] = normal; + mesh->mNormals[face.mIndices[2]] = normal; + } + + // Attach the mesh to the scene + scene->mMeshes = new aiMesh*[1]; + scene->mMeshes[0] = mesh; + scene->mNumMeshes = 1; + + // Create a root node and attach the mesh + scene->mRootNode = new aiNode(); + scene->mRootNode->mMeshes = new unsigned int[1]{0}; + scene->mRootNode->mNumMeshes = 1; + + return scene; +} + +cv::Vec3f rotate3D(const cv::Vec3f& point, const cv::Vec3f& center, const cv::Vec3f& rotation) +{ + // Convert rotation vector to rotation matrix + cv::Matx33f rotationMatrix; + cv::Rodrigues(rotation, rotationMatrix); + + // Subtract center from point + cv::Vec3f translatedPoint = point - center; + + // Rotate the point using the rotation matrix + cv::Vec3f rotatedPoint = rotationMatrix * translatedPoint; + + // Translate the point back + rotatedPoint += center; + + return rotatedPoint; +} + +cv::Matx44f perspective(float fov, float aspect, float zNear, float zFar) { + float tanHalfFovy = tan(fov / 2.0f); + + cv::Matx44f projection = cv::Matx44f::eye(); + projection(0, 0) = 1.0f / (aspect * tanHalfFovy); + projection(1, 1) = 1.0f / (tanHalfFovy); // Invert the 
y-coordinate + projection(2, 2) = -(zFar + zNear) / (zFar - zNear); // Invert the z-coordinate + projection(2, 3) = -1.0f; + projection(3, 2) = -(2.0f * zFar * zNear) / (zFar - zNear); + projection(3, 3) = 0.0f; + + return projection; +} + +cv::Matx44f lookAt(cv::Vec3f eye, cv::Vec3f center, cv::Vec3f up) { + cv::Vec3f f = cv::normalize(center - eye); + cv::Vec3f s = cv::normalize(f.cross(up)); + cv::Vec3f u = s.cross(f); + + cv::Matx44f view = cv::Matx44f::eye(); + view(0, 0) = s[0]; + view(0, 1) = u[0]; + view(0, 2) = -f[0]; + view(0, 3) = 0.0f; + view(1, 0) = s[1]; + view(1, 1) = u[1]; + view(1, 2) = -f[1]; + view(1, 3) = 0.0f; + view(2, 0) = s[2]; + view(2, 1) = u[2]; + view(2, 2) = -f[2]; + view(2, 3) = 0.0f; + view(3, 0) = -s.dot(eye); + view(3, 1) = -u.dot(eye); + view(3, 2) = f.dot(eye); + view(3, 3) = 1.0f; + + return view; +} + +cv::Matx44f modelView(const cv::Vec3f& translation, const cv::Vec3f& rotationVec, const cv::Vec3f& scaleVec) { + cv::Matx44f scaleMat( + scaleVec[0], 0.0, 0.0, 0.0, + 0.0, scaleVec[1], 0.0, 0.0, + 0.0, 0.0, scaleVec[2], 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotXMat( + 1.0, 0.0, 0.0, 0.0, + 0.0, cos(rotationVec[0]), -sin(rotationVec[0]), 0.0, + 0.0, sin(rotationVec[0]), cos(rotationVec[0]), 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotYMat( + cos(rotationVec[1]), 0.0, sin(rotationVec[1]), 0.0, + 0.0, 1.0, 0.0, 0.0, + -sin(rotationVec[1]), 0.0,cos(rotationVec[1]), 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f rotZMat( + cos(rotationVec[2]), -sin(rotationVec[2]), 0.0, 0.0, + sin(rotationVec[2]), cos(rotationVec[2]), 0.0, 0.0, + 0.0, 0.0, 1.0, 0.0, + 0.0, 0.0, 0.0, 1.0); + + cv::Matx44f translateMat( + 1.0, 0.0, 0.0, 0.0, + 0.0, 1.0, 0.0, 0.0, + 0.0, 0.0, 1.0, 0.0, + translation[0], translation[1], translation[2], 1.0); + + return translateMat * rotXMat * rotYMat * rotZMat * scaleMat; +} + + +static void calculateBoundingBox(const aiMesh* mesh, cv::Vec3f& min, cv::Vec3f& max) { + for (unsigned int i = 0; i < mesh->mNumVertices; ++i) { + aiVector3D vertex = mesh->mVertices[i]; + if (i == 0) { + min = max = cv::Vec3f(vertex.x, vertex.y, vertex.z); + } else { + min[0] = std::min(min[0], vertex.x); + min[1] = std::min(min[1], vertex.y); + min[2] = std::min(min[2], vertex.z); + + max[0] = std::max(max[0], vertex.x); + max[1] = std::max(max[1], vertex.y); + max[2] = std::max(max[2], vertex.z); + } + } +} + +static void calculateBoundingBoxInfo(const aiMesh* mesh, cv::Vec3f& center, cv::Vec3f& size) { + cv::Vec3f min, max; + calculateBoundingBox(mesh, min, max); + center = (min + max) / 2.0f; + size = max - min; +} + +static float calculateAutoScale(const aiMesh* mesh) { + cv::Vec3f center, size; + calculateBoundingBoxInfo(mesh, center, size); + + float maxDimension = std::max(size[0], std::max(size[1], size[2])); + return 1.0f / maxDimension; +} + +static void drawMesh(aiMesh* mesh, Scene::RenderMode mode) { + // Generate and bind VAO + GLuint VAO; + glGenVertexArrays(1, &VAO); + glBindVertexArray(VAO); + + // Load vertex data + GLuint VBO; + glGenBuffers(1, &VBO); + glBindBuffer(GL_ARRAY_BUFFER, VBO); + glBufferData(GL_ARRAY_BUFFER, mesh->mNumVertices * 3 * sizeof(float), mesh->mVertices, GL_STATIC_DRAW); + + // Specify vertex attributes + glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), (void*)0); + glEnableVertexAttribArray(0); + + // Load index data, if present + if (mesh->HasFaces()) { + std::vector indices; + for (unsigned int i = 0; i < mesh->mNumFaces; i++) { + aiFace face = mesh->mFaces[i]; + for (unsigned int j = 0; j < 
face.mNumIndices; j++) + indices.push_back(face.mIndices[j]); + } + + if (mode != Scene::RenderMode::DEFAULT) { + // Duplicate vertices for wireframe rendering or point rendering + std::vector modifiedIndices; + for (size_t i = 0; i < indices.size(); i += 3) { + if (mode == Scene::RenderMode::WIREFRAME) { + // Duplicate vertices for wireframe rendering + modifiedIndices.push_back(indices[i]); + modifiedIndices.push_back(indices[i + 1]); + + modifiedIndices.push_back(indices[i + 1]); + modifiedIndices.push_back(indices[i + 2]); + + modifiedIndices.push_back(indices[i + 2]); + modifiedIndices.push_back(indices[i]); + } + + if (mode == Scene::RenderMode::POINTCLOUD) { + // Duplicate vertices for point rendering + modifiedIndices.push_back(indices[i]); + modifiedIndices.push_back(indices[i + 1]); + modifiedIndices.push_back(indices[i + 2]); + } + } + + GLuint EBO; + glGenBuffers(1, &EBO); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, EBO); + glBufferData(GL_ELEMENT_ARRAY_BUFFER, modifiedIndices.size() * sizeof(unsigned int), &modifiedIndices[0], GL_STATIC_DRAW); + + // Draw as lines or points + if (mode == Scene::RenderMode::WIREFRAME) { + glDrawElements(GL_LINES, modifiedIndices.size(), GL_UNSIGNED_INT, 0); + } else if (mode == Scene::RenderMode::POINTCLOUD) { + glDrawElements(GL_POINTS, modifiedIndices.size(), GL_UNSIGNED_INT, 0); + } + + // Cleanup + glDeleteBuffers(1, &EBO); + } else { + GLuint EBO; + glGenBuffers(1, &EBO); + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, EBO); + glBufferData(GL_ELEMENT_ARRAY_BUFFER, indices.size() * sizeof(unsigned int), &indices[0], GL_STATIC_DRAW); + + // Draw as triangles + glDrawElements(GL_TRIANGLES, indices.size(), GL_UNSIGNED_INT, 0); + + // Cleanup + glDeleteBuffers(1, &EBO); + } + } else { + glDrawArrays(GL_TRIANGLES, 0, mesh->mNumVertices); + } + + // Cleanup + glDeleteVertexArrays(1, &VAO); + glDeleteBuffers(1, &VBO); +} + +// Function to recursively draw a node and its children +static void drawNode(aiNode* node, const aiScene* scene, Scene::RenderMode mode) { + // Draw all meshes at this node + for(unsigned int i = 0; i < node->mNumMeshes; i++) { + aiMesh* mesh = scene->mMeshes[node->mMeshes[i]]; + drawMesh(mesh, mode); + } + + // Recurse for all children + for(unsigned int i = 0; i < node->mNumChildren; i++) { + drawNode(node->mChildren[i], scene, mode); + } +} + +// Function to draw a model +static void drawModel(const aiScene* scene, Scene::RenderMode mode) { + // Draw the root node + drawNode(scene->mRootNode, scene, mode); +} + +static void applyModelView(cv::Mat_& points, const cv::Matx44f& transformation) { + // Ensure the input points matrix has the correct dimensions (3 columns for x, y, z) + CV_Assert(points.cols == 3); + + // Construct the 4x4 transformation matrix with scaling + + + // Convert points to homogeneous coordinates (add a column of ones) + cv::hconcat(points, cv::Mat::ones(points.rows, 1, CV_32F), points); + + // Transpose the points matrix for multiplication + cv::Mat pointsTransposed = points.t(); + + // Apply the transformation + cv::Mat transformedPoints = transformation * pointsTransposed; + + // Transpose back to the original orientation + transformedPoints = transformedPoints.t(); + + // Extract the transformed 3D points (excluding the fourth homogeneous coordinate) + points = transformedPoints(cv::Rect(0, 0, 3, transformedPoints.rows)).clone(); +} + +static void applyModelView(std::vector& points, const cv::Matx44f& transformation) { + // Ensure the input points vector is not empty + if (points.empty()) { + std::cerr << 
"Error: Input points vector is empty.\n"; + return; + } + + // Apply the model-view transformation to each point + for (auto& point : points) { + // Convert the point to a column vector + cv::Mat pointMat = (cv::Mat_(3, 1) << point.x, point.y, point.z); + + pointMat = transformation * pointMat; + + // Update the point with the transformed values + point = cv::Point3f(pointMat.at(0, 0), pointMat.at(1, 0), pointMat.at(2, 0)); + } +} + +static void processNode(const aiNode* node, const aiScene* scene, cv::Mat_& allVertices) { + // Process all meshes in the current node + for (unsigned int i = 0; i < node->mNumMeshes; ++i) { + const aiMesh* mesh = scene->mMeshes[node->mMeshes[i]]; + + // Process all vertices in the current mesh + for (unsigned int j = 0; j < mesh->mNumVertices; ++j) { + aiVector3D aiVertex = mesh->mVertices[j]; + cv::Mat_ vertex = (cv::Mat_(1, 3) << aiVertex.x, aiVertex.y, aiVertex.z); + allVertices.push_back(vertex); + } + } + + // Recursively process child nodes + for (unsigned int i = 0; i < node->mNumChildren; ++i) { + processNode(node->mChildren[i], scene, allVertices); + } +} + +static void processNode(const aiNode* node, const aiScene* scene, std::vector& allVertices) { + // Process all meshes in the current node + for (unsigned int i = 0; i < node->mNumMeshes; ++i) { + const aiMesh* mesh = scene->mMeshes[node->mMeshes[i]]; + + // Process all vertices in the current mesh + for (unsigned int j = 0; j < mesh->mNumVertices; ++j) { + aiVector3D aiVertex = mesh->mVertices[j]; + cv::Point3f vertex(aiVertex.x, aiVertex.y, aiVertex.z); + allVertices.push_back(vertex); + } + } + + // Recursively process child nodes + for (unsigned int i = 0; i < node->mNumChildren; ++i) { + processNode(node->mChildren[i], scene, allVertices); + } +} + +Scene::Scene() { +} + +Scene::~Scene() { +} + +void Scene::reset() { + if(shaderHandles_[0] > 0) + glDeleteProgram(shaderHandles_[0]); + if(shaderHandles_[1] > 0) + glDeleteShader(shaderHandles_[1]); + if(shaderHandles_[2] > 0) + glDeleteShader(shaderHandles_[2]); + //FIXME how to cleanup a scene? 
+// releaseAssimpScene(scene_); +} + +bool Scene::load(const std::vector<cv::Point3f>& points) { + reset(); + std::vector<cv::Point3f> copy = points; + scene_ = createAssimpScene(copy); + cv::v4d::initShader(shaderHandles_, vertexShaderSource_.c_str(), fragmentShaderSource_.c_str(), "fragColor"); + calculateBoundingBoxInfo(scene_->mMeshes[0], autoCenter_, size_); + autoScale_ = calculateAutoScale(scene_->mMeshes[0]); + return true; +} + + +bool Scene::load(const std::string& filename) { + reset(); + scene_ = importer_.ReadFile(filename, aiProcess_Triangulate | aiProcess_GenNormals); + + if (!scene_ || (scene_->mFlags & AI_SCENE_FLAGS_INCOMPLETE) || !scene_->mRootNode) { + return false; + } + + + cv::v4d::initShader(shaderHandles_, vertexShaderSource_.c_str(), fragmentShaderSource_.c_str(), "fragColor"); + calculateBoundingBoxInfo(scene_->mMeshes[0], autoCenter_, size_); + autoScale_ = calculateAutoScale(scene_->mMeshes[0]); + return true; +} + +cv::Mat_<float> Scene::pointCloudAsMat() { + cv::Mat_<float> allVertices; + processNode(scene_->mRootNode, scene_, allVertices); + return allVertices; +} + +std::vector<cv::Point3f> Scene::pointCloudAsVector() { + std::vector<cv::Point3f> allVertices; + processNode(scene_->mRootNode, scene_, allVertices); + return allVertices; +} + +void Scene::render(const cv::Rect& viewport, const cv::Matx44f& projection, const cv::Matx44f& view, const cv::Matx44f& modelView) { + glViewport(viewport.x, viewport.y, viewport.width, viewport.height); + glEnable(GL_DEPTH_TEST); + glEnable(GL_VERTEX_PROGRAM_POINT_SIZE); + // Bind the shader program before setting its uniforms; glUniform* affects the currently used program + glUseProgram(shaderHandles_[0]); + glUniformMatrix4fv(glGetUniformLocation(shaderHandles_[0], "projection"), 1, GL_FALSE, projection.val); + glUniformMatrix4fv(glGetUniformLocation(shaderHandles_[0], "view"), 1, GL_FALSE, view.val); + glUniform3fv(glGetUniformLocation(shaderHandles_[0], "lightPos"), 1, lightPos_.val); + glUniform3fv(glGetUniformLocation(shaderHandles_[0], "viewPos"), 1, viewPos_.val); + glUniform1i(glGetUniformLocation(shaderHandles_[0], "renderMode"), mode_); + glUniformMatrix4fv(glGetUniformLocation(shaderHandles_[0], "model"), 1, GL_FALSE, modelView.val); + + drawModel(scene_, mode_); +} + +} /* namespace gl */ +} /* namespace v4d */ +} /* namespace cv */ diff --git a/modules/v4d/src/sink.cpp b/modules/v4d/src/sink.cpp new file mode 100644 index 000000000..0918aaa16 --- /dev/null +++ b/modules/v4d/src/sink.cpp @@ -0,0 +1,60 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/sink.hpp" +#include + +namespace cv { +namespace v4d { + +Sink::Sink(std::function consumer) : + consumer_(consumer) { +} + +Sink::Sink() { + +} +Sink::~Sink() { +} + +bool Sink::isReady() { + std::lock_guard lock(mtx_); + if (consumer_) + return true; + else + return false; +} + +bool Sink::isOpen() { + std::lock_guard lock(mtx_); + return open_; +} + +void Sink::operator()(const uint64_t& seq, const cv::UMat& frame) { + std::lock_guard lock(mtx_); + if(seq == nextSeq_) { + uint64_t currentSeq = seq; + cv::UMat currentFrame = frame; + buffer_[seq] = frame; + do { + open_ = consumer_(currentSeq, currentFrame); + ++nextSeq_; + buffer_.erase(buffer_.begin()); + if(buffer_.empty()) + break; + auto pair = (*buffer_.begin()); + currentSeq = pair.first; + currentFrame = pair.second; + } while(currentSeq == nextSeq_); + } else { + buffer_[seq] = frame; + } + if(buffer_.size() > 240) { + CV_LOG_WARNING(nullptr, "Buffer overrun in sink."); + buffer_.clear(); + } +} +} /* namespace v4d */ +} /* namespace kb */ diff --git a/modules/v4d/src/source.cpp b/modules/v4d/src/source.cpp new file mode 100644 index 000000000..c63e099e6 --- /dev/null +++ b/modules/v4d/src/source.cpp @@ -0,0 +1,45 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/source.hpp" + +namespace cv { +namespace v4d { + +Source::Source(std::function generator, float fps) : + generator_(generator), fps_(fps) { +} + +Source::Source() : + open_(false), fps_(0) { +} + +Source::~Source() { +} + +bool Source::isOpen() { + std::lock_guard guard(mtx_); + return generator_ && open_; +} + +float Source::fps() { + return fps_; +} + +std::pair Source::operator()() { + std::lock_guard guard(mtx_); + static thread_local cv::UMat frame; + if(threadSafe_) { + static std::mutex mtx_; + std::unique_lock lock(mtx_); + open_ = generator_(frame); + return {count_++, frame}; + } else { + open_ = generator_(frame); + return {count_++, frame}; + } +} +} /* namespace v4d */ +} /* namespace kb */ diff --git a/modules/v4d/src/util.cpp b/modules/v4d/src/util.cpp new file mode 100644 index 000000000..2d2e3c6db --- /dev/null +++ b/modules/v4d/src/util.cpp @@ -0,0 +1,432 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. +// Copyright Amir Hassan (kallaballa) + +#include +#include +#include + +#include "../include/opencv2/v4d/v4d.hpp" +#include "../include/opencv2/v4d/util.hpp" + +#include +#include +#include +#include +#include +#include +#include + +using std::cerr; +using std::endl; + +namespace cv { +namespace v4d { +namespace detail { + +#ifdef __GNUG__ +std::string demangle(const char* name) { + int status = -4; // some arbitrary value to eliminate the compiler warning + std::unique_ptr res { + abi::__cxa_demangle(name, NULL, NULL, &status), + std::free + }; + + return (status==0) ? 
res.get() : name ; +} + +#else +// does nothing if not g++ +std::string demangle(const char* name) { + return name; +} +#endif + +size_t cnz(const cv::UMat& m) { + cv::UMat grey; + if(m.channels() == 1) { + grey = m; + } else if(m.channels() == 3) { + cvtColor(m, grey, cv::COLOR_BGR2GRAY); + } else if(m.channels() == 4) { + cvtColor(m, grey, cv::COLOR_BGRA2GRAY); + } else { + assert(false); + } + return cv::countNonZero(grey); +} +} + +CV_EXPORTS void copy_shared(const cv::UMat& src, cv::UMat& dst) { + if(dst.empty()) + dst.create(src.size(), src.type()); + src.copyTo(dst.getMat(cv::ACCESS_READ)); +} + +cv::Scalar colorConvert(const cv::Scalar& src, cv::ColorConversionCodes code) { + cv::Mat tmpIn(1, 1, CV_8UC3); + cv::Mat tmpOut(1, 1, CV_8UC3); + + tmpIn.at(0, 0) = cv::Vec3b(src[0], src[1], src[2]); + cvtColor(tmpIn, tmpOut, code); + const cv::Vec3b& vdst = tmpOut.at(0, 0); + cv::Scalar dst(vdst[0], vdst[1], vdst[2], src[3]); + return dst; +} + +void gl_check_error(const std::filesystem::path& file, unsigned int line, const char* expression) { + int errorCode = glGetError(); +// cerr << "TRACE: " << file.filename() << " (" << line << ") : " << expression << " => code: " << errorCode << endl; + if (errorCode != 0) { + std::stringstream ss; + ss << "GL failed in " << file.filename() << " (" << line << ") : " << "\nExpression:\n " + << expression << "\nError code:\n " << errorCode; + CV_LOG_WARNING(nullptr, ss.str()); + } +} + +void initShader(unsigned int handles[3], const char* vShader, const char* fShader, const char* outputAttributeName) { + struct Shader { + GLenum type; + const char* source; + } shaders[2] = { { GL_VERTEX_SHADER, vShader }, { GL_FRAGMENT_SHADER, fShader } }; + + GLuint program = glCreateProgram(); + handles[0] = program; + + for (int i = 0; i < 2; ++i) { + Shader& s = shaders[i]; + GLuint shader = glCreateShader(s.type); + handles[i + 1] = shader; + glShaderSource(shader, 1, (const GLchar**) &s.source, NULL); + glCompileShader(shader); + + GLint compiled; + glGetShaderiv(shader, GL_COMPILE_STATUS, &compiled); + if (!compiled) { + std::cerr << " failed to compile:" << std::endl; + GLint logSize; + glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logSize); + char* logMsg = new char[logSize]; + glGetShaderInfoLog(shader, logSize, NULL, logMsg); + std::cerr << logMsg << std::endl; + delete[] logMsg; + + exit (EXIT_FAILURE); + } + + glAttachShader(program, shader); + } +#if !defined(OPENCV_V4D_USE_ES3) + /* Link output */ + glBindFragDataLocation(program, 0, outputAttributeName); +#else + CV_UNUSED(outputAttributeName); +#endif + /* link and error check */ + glLinkProgram(program); + + GLint linked; + glGetProgramiv(program, GL_LINK_STATUS, &linked); + if (!linked) { + std::cerr << "Shader program failed to link" << std::endl; + GLint logSize; + glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logSize); + char* logMsg = new char[logSize]; + glGetProgramInfoLog(program, logSize, NULL, logMsg); + std::cerr << logMsg << std::endl; + delete[] logMsg; + + exit (EXIT_FAILURE); + } + +} + +std::string getGlVendor() { + std::ostringstream oss; + oss << reinterpret_cast(glGetString(GL_VENDOR)); + return oss.str(); +} + +std::string getGlInfo() { + std::ostringstream oss; + oss << "\n\t" << reinterpret_cast(glGetString(GL_VERSION)) + << "\n\t" << reinterpret_cast(glGetString(GL_RENDERER)) << endl; + return oss.str(); +} + +std::string getClInfo() { + std::stringstream ss; +#ifdef HAVE_OPENCL + if(cv::ocl::useOpenCL()) { + std::vector plt_info; + cv::ocl::getPlatfomsInfo(plt_info); + const 
cv::ocl::Device& defaultDevice = cv::ocl::Device::getDefault(); + cv::ocl::Device current; + ss << endl; + for (const auto& info : plt_info) { + for (int i = 0; i < info.deviceNumber(); ++i) { + ss << "\t"; + info.getDevice(current, i); + if (defaultDevice.name() == current.name()) + ss << "* "; + else + ss << " "; + ss << info.version() << " = " << info.name() << endl; + ss << "\t\t GL sharing: " + << (current.isExtensionSupported("cl_khr_gl_sharing") ? "true" : "false") + << endl; + ss << "\t\t VAAPI media sharing: " + << (current.isExtensionSupported("cl_intel_va_api_media_sharing") ? + "true" : "false") << endl; + } + } + } +#endif + return ss.str(); +} + +bool isIntelVaSupported() { +#ifdef HAVE_OPENCL + if(cv::ocl::useOpenCL()) { + try { + std::vector plt_info; + cv::ocl::getPlatfomsInfo(plt_info); + cv::ocl::Device current; + for (const auto& info : plt_info) { + for (int i = 0; i < info.deviceNumber(); ++i) { + info.getDevice(current, i); + return current.isExtensionSupported("cl_intel_va_api_media_sharing"); + } + } + } catch (std::exception& ex) { + cerr << "Intel VAAPI query failed: " << ex.what() << endl; + } catch (...) { + cerr << "Intel VAAPI query failed" << endl; + } + } +#endif + return false; +} + +bool isClGlSharingSupported() { +#ifdef HAVE_OPENCL + if(cv::ocl::useOpenCL()) { + try { + if(!cv::ocl::useOpenCL()) + return false; + std::vector plt_info; + cv::ocl::getPlatfomsInfo(plt_info); + cv::ocl::Device current; + for (const auto& info : plt_info) { + for (int i = 0; i < info.deviceNumber(); ++i) { + info.getDevice(current, i); + return current.isExtensionSupported("cl_khr_gl_sharing"); + } + } + } catch (std::exception& ex) { + cerr << "CL-GL sharing query failed: " << ex.what() << endl; + } catch (...) { + cerr << "CL-GL sharing query failed with unknown error." << endl; + } + } +#endif + return false; +} +static std::mutex finish_mtx; +/*! + * Internal variable that signals that finishing all operation is requested + */ +static bool finish_requested = false; +/*! + * Internal variable that tracks if signal handlers have already been installed + */ +static bool signal_handlers_installed = false; + +/*! + * Signal handler callback that signals the application to terminate. + * @param ignore We ignore the signal number + */ +static void request_finish(int ignore) { + std::lock_guard guard(finish_mtx); + CV_UNUSED(ignore); + finish_requested = true; +} + +/*! 
+ * Installs #request_finish() as signal handler for SIGINT and SIGTERM + */ +static void install_signal_handlers() { + signal(SIGINT, request_finish); + signal(SIGTERM, request_finish); +} + +bool keepRunning() { + std::lock_guard guard(finish_mtx); + if (!signal_handlers_installed) { + install_signal_handlers(); + } + return !finish_requested; +} + +void requestFinish() { + request_finish(0); +} + +cv::Ptr makeVaSink(cv::Ptr window, const string& outputFilename, const int fourcc, const float fps, + const cv::Size& frameSize, const int vaDeviceIndex) { + cv::Ptr writer = new cv::VideoWriter(outputFilename, cv::CAP_FFMPEG, + fourcc, fps, frameSize, { + cv::VIDEOWRITER_PROP_HW_DEVICE, vaDeviceIndex, + cv::VIDEOWRITER_PROP_HW_ACCELERATION, cv::VIDEO_ACCELERATION_VAAPI, + cv::VIDEOWRITER_PROP_HW_ACCELERATION_USE_OPENCL, 1 }); + if(isIntelVaSupported()) + window->sourceCtx()->copyContext(); + + cerr << "Using a VA sink" << endl; + if(writer->isOpened()) { + return new Sink([=](const uint64_t& seq, const cv::UMat& frame) { + CV_UNUSED(seq); + CLExecScope_t scope(window->sourceCtx()->getCLExecContext()); + //FIXME cache it + cv::UMat converted; + cv::resize(frame, converted, frameSize); + cvtColor(converted, converted, cv::COLOR_BGRA2RGB); + (*writer) << converted; + return writer->isOpened(); + }); + } else { + return new Sink(); + } +} + +cv::Ptr makeVaSource(cv::Ptr window, const string& inputFilename, const int vaDeviceIndex) { + cv::Ptr capture = new cv::VideoCapture(inputFilename, cv::CAP_FFMPEG, { + cv::CAP_PROP_HW_DEVICE, vaDeviceIndex, cv::CAP_PROP_HW_ACCELERATION, + cv::VIDEO_ACCELERATION_VAAPI, cv::CAP_PROP_HW_ACCELERATION_USE_OPENCL, 1 }); + float fps = capture->get(cv::CAP_PROP_FPS); + cerr << "Using a VA source" << endl; + if(isIntelVaSupported()) + window->sourceCtx()->copyContext(); + + return new Source([=](cv::UMat& frame) { + CLExecScope_t scope(window->sourceCtx()->getCLExecContext()); + (*capture) >> frame; + return !frame.empty(); + }, fps); +} + +static cv::Ptr makeAnyHWSink(const string& outputFilename, const int fourcc, const float fps, + const cv::Size& frameSize) { + cv::Ptr writer = new cv::VideoWriter(outputFilename, cv::CAP_FFMPEG, + fourcc, fps, frameSize, { cv::VIDEOWRITER_PROP_HW_ACCELERATION, cv::VIDEO_ACCELERATION_ANY }); + + if(writer->isOpened()) { + return new Sink([=](const uint64_t& seq, const cv::UMat& frame) { + CV_UNUSED(seq); + cv::UMat converted; + cv::UMat context_corrected; + frame.copyTo(context_corrected); + cv::resize(context_corrected, converted, frameSize); + cvtColor(converted, converted, cv::COLOR_BGRA2RGB); + (*writer) << converted; + return writer->isOpened(); + }); + } else { + return new Sink(); + } +} + +static cv::Ptr makeAnyHWSource(const string& inputFilename) { + cv::Ptr capture = new cv::VideoCapture(inputFilename, cv::CAP_FFMPEG, { + cv::CAP_PROP_HW_ACCELERATION, cv::VIDEO_ACCELERATION_ANY }); + float fps = capture->get(cv::CAP_PROP_FPS); + + return new Source([=](cv::UMat& frame) { + (*capture) >> frame; + return !frame.empty(); + }, fps); +} + +cv::Ptr makeWriterSink(cv::Ptr window, const string& outputFilename, const float fps, const cv::Size& frameSize) { + int fourcc = 0; + //FIXME find a cleverer way to guess a decent codec + if(getGlVendor() == "NVIDIA Corporation") { + fourcc = cv::VideoWriter::fourcc('H', '2', '6', '4'); + } else { + fourcc = cv::VideoWriter::fourcc('V', 'P', '9', '0'); + } + return makeWriterSink(window, outputFilename, fps, frameSize, fourcc); +} + +cv::Ptr makeWriterSink(cv::Ptr window, const string& 
outputFilename, const float fps, + const cv::Size& frameSize, int fourcc) { + if (isIntelVaSupported()) { + return makeVaSink(window, outputFilename, fourcc, fps, frameSize, 0); + } else { + try { + return makeAnyHWSink(outputFilename, fourcc, fps, frameSize); + } catch(...) { + cerr << "Failed creating hardware source" << endl; + } + } + + cv::Ptr writer = new cv::VideoWriter(outputFilename, cv::CAP_FFMPEG, + fourcc, fps, frameSize); + + if(writer->isOpened()) { + return new Sink([=](const uint64_t& seq, const cv::UMat& frame) { + CV_UNUSED(seq); + cv::UMat converted; + cv::resize(frame, converted, frameSize); + cvtColor(converted, converted, cv::COLOR_BGRA2RGB); + (*writer) << converted; + return writer->isOpened(); + }); + } else { + return new Sink(); + } +} + +cv::Ptr makeCaptureSource(cv::Ptr window, const string& inputFilename) { + if (isIntelVaSupported()) { + return makeVaSource(window, inputFilename, 0); + } else { + try { + return makeAnyHWSource(inputFilename); + } catch(...) { + cerr << "Failed creating hardware source" << endl; + } + } + + cv::Ptr capture = new cv::VideoCapture(inputFilename, cv::CAP_FFMPEG); + float fps = capture->get(cv::CAP_PROP_FPS); + + return new Source([=](cv::UMat& frame) { + (*capture) >> frame; + return !frame.empty(); + }, fps); +} + +void resizePreserveAspectRatio(const cv::UMat& src, cv::UMat& output, const cv::Size& dstSize, const cv::Scalar& bgcolor) { + cv::UMat tmp; + double hf = double(dstSize.height) / src.size().height; + double wf = double(dstSize.width) / src.size().width; + double f = std::min(hf, wf); + if (f < 0) + f = 1.0 / f; + + cv::resize(src, tmp, cv::Size(), f, f); + + int top = (dstSize.height - tmp.rows) / 2; + int down = (dstSize.height - tmp.rows + 1) / 2; + int left = (dstSize.width - tmp.cols) / 2; + int right = (dstSize.width - tmp.cols + 1) / 2; + + cv::copyMakeBorder(tmp, output, top, down, left, right, cv::BORDER_CONSTANT, bgcolor); +} + +} +} + diff --git a/modules/v4d/src/v4d.cpp b/modules/v4d/src/v4d.cpp new file mode 100644 index 000000000..b9831e04c --- /dev/null +++ b/modules/v4d/src/v4d.cpp @@ -0,0 +1,453 @@ +// This file is part of OpenCV project. +// It is subject to the license terms in the LICENSE file found in the top-level directory +// of this distribution and at http://opencv.org/license.html. 
+// Copyright Amir Hassan (kallaballa) + +#include "opencv2/v4d/v4d.hpp" +#include "opencv2/v4d/detail/framebuffercontext.hpp" +#include +#include +#include +#include + +namespace cv { +namespace v4d { + +cv::Ptr V4D::make(const cv::Size& size, const string& title, AllocateFlags flags, bool offscreen, bool debug, int samples) { + V4D* v4d = new V4D(size, cv::Size(), title, flags, offscreen, debug, samples); + v4d->setVisible(!offscreen); + v4d->fbCtx()->makeCurrent(); + return v4d->self(); +} + +cv::Ptr V4D::make(const cv::Size& size, const cv::Size& fbsize, const string& title, AllocateFlags flags, bool offscreen, bool debug, int samples) { + V4D* v4d = new V4D(size, fbsize, title, flags, offscreen, debug, samples); + v4d->setVisible(!offscreen); + v4d->fbCtx()->makeCurrent(); + return v4d->self(); +} + +cv::Ptr V4D::make(const V4D& other, const string& title) { + V4D* v4d = new V4D(other, title); + v4d->setVisible(other.debug_); + v4d->fbCtx()->makeCurrent(); + return v4d->self(); +} + +V4D::V4D(const cv::Size& size, const cv::Size& fbsize, const string& title, AllocateFlags aflags, bool offscreen, bool debug, int samples) : + initialSize_(size), flags_(aflags), debug_(debug), viewport_(0, 0, size.width, size.height), stretching_(true), samples_(samples) { + self_ = cv::Ptr(this); + mainFbContext_ = new detail::FrameBufferContext(*this, fbsize.empty() ? size : fbsize, offscreen, title, 3, + 2, samples, debug, nullptr, nullptr, true); + CLExecScope_t scope(mainFbContext_->getCLExecContext()); + if(flags() & NANOVG) + nvgContext_ = new detail::NanoVGContext(mainFbContext_); + sourceContext_ = new detail::SourceContext(mainFbContext_); + sinkContext_ = new detail::SinkContext(mainFbContext_); + onceContext_ = new detail::OnceContext(); + plainContext_ = new detail::PlainContext(); + if(flags() & IMGUI) + imguiContext_ = new detail::ImGuiContextImpl(mainFbContext_); + + //preallocate the primary gl context + glCtx(-1); +} + +V4D::V4D(const V4D& other, const string& title) : + initialSize_(other.initialSize_), flags_(other.flags_), debug_(other.debug_), viewport_(0, 0, other.fbSize().width, other.fbSize().height), stretching_(other.stretching_), samples_(other.samples_) { + workerIdx_ = Global::next_worker_idx(); + self_ = cv::Ptr(this); + mainFbContext_ = new detail::FrameBufferContext(*this, other.fbSize(), !other.debug_, title, 3, + 2, other.samples_, other.debug_, other.fbCtx()->rootWindow_, other.fbCtx(), true); + + CLExecScope_t scope(mainFbContext_->getCLExecContext()); + if(flags() & NANOVG) + nvgContext_ = new detail::NanoVGContext(mainFbContext_); + sourceContext_ = new detail::SourceContext(mainFbContext_); + sinkContext_ = new detail::SinkContext(mainFbContext_); + onceContext_ = new detail::OnceContext(); + plainContext_ = new detail::PlainContext(); + + //preallocate the primary gl context + glCtx(-1); +} + +V4D::~V4D() { + +} + +const int32_t& V4D::workerIndex() const { + return workerIdx_; +} + +size_t V4D::workers_running() { + return Global::workers_started(); +} + +cv::ogl::Texture2D& V4D::texture() { + return mainFbContext_->getTexture2D(); +} + +std::string V4D::title() const { + return fbCtx()->title_; +} + +cv::Point2f V4D::getMousePosition() { + return mousePos_; +} + +void V4D::setMousePosition(const cv::Point2f& pt) { + mousePos_ = pt; +} + +cv::Ptr V4D::fbCtx() const { + assert(mainFbContext_ != nullptr); + return mainFbContext_; +} + +cv::Ptr V4D::sourceCtx() { + assert(sourceContext_ != nullptr); + return sourceContext_; +} + +cv::Ptr V4D::sinkCtx() { + 
assert(sinkContext_ != nullptr); + return sinkContext_; +} + +cv::Ptr V4D::nvgCtx() { + assert(nvgContext_ != nullptr); + return nvgContext_; +} + +cv::Ptr V4D::onceCtx() { + assert(onceContext_ != nullptr); + return onceContext_; +} + +cv::Ptr V4D::plainCtx() { + assert(plainContext_ != nullptr); + return plainContext_; +} + +cv::Ptr V4D::imguiCtx() { + assert(imguiContext_ != nullptr); + return imguiContext_; +} + +cv::Ptr V4D::glCtx(int32_t idx) { + auto it = glContexts_.find(idx); + if(it != glContexts_.end()) + return (*it).second; + else { + cv::Ptr ctx = new GLContext(idx, mainFbContext_); + glContexts_.insert({idx, ctx}); + return ctx; + } +} + +bool V4D::hasFbCtx() { + return mainFbContext_ != nullptr; +} + +bool V4D::hasSourceCtx() { + return sourceContext_ != nullptr; +} + +bool V4D::hasSinkCtx() { + return sinkContext_ != nullptr; +} + +bool V4D::hasNvgCtx() { + return nvgContext_ != nullptr; +} + +bool V4D::hasOnceCtx() { + return onceContext_ != nullptr; +} + +bool V4D::hasParallelCtx() { + return plainContext_ != nullptr; +} + +bool V4D::hasImguiCtx() { + return imguiContext_ != nullptr; +} + +bool V4D::hasGlCtx(uint32_t idx) { + return glContexts_.find(idx) != glContexts_.end(); +} + +size_t V4D::numGlCtx() { + return std::max(off_t(0), off_t(glContexts_.size()) - 1); +} + +void V4D::copyTo(cv::UMat& m) { + fbCtx()->copyTo(m); +} + +void V4D::copyFrom(const cv::UMat& m) { + fbCtx()->copyFrom(m); +} + +void V4D::setSource(cv::Ptr src) { + source_ = src; +} +cv::Ptr V4D::getSource() { + return source_; +} + +bool V4D::hasSource() { + return source_ != nullptr; +} + +void V4D::feed(cv::UMat& in) { + static thread_local cv::UMat frame; + + plain([](cv::UMat& src, cv::UMat& f, const cv::Size sz) { + cv::UMat rgb; + + resizePreserveAspectRatio(src, rgb, sz); + cv::cvtColor(rgb, f, cv::COLOR_RGB2BGRA); + }, in, frame, mainFbContext_->size()); + + fb([](cv::UMat& frameBuffer, const cv::UMat& f) { + f.copyTo(frameBuffer); + }, frame); +} + +cv::UMat V4D::fetch() { + static thread_local cv::UMat frame; + fb([](const cv::UMat& fb, cv::UMat& f) { + fb.copyTo(f); + }, frame); + return frame; +} + + +void V4D::setSink(cv::Ptr sink) { + sink_ = sink; +} + +cv::Ptr V4D::getSink() { + return sink_; +} + +bool V4D::hasSink() { + return sink_ != nullptr; +} + +cv::Vec2f V4D::position() { + return fbCtx()->position(); +} + +cv::Rect& V4D::viewport() { + return viewport_; +} + +float V4D::pixelRatioX() { + return fbCtx()->pixelRatioX(); +} + +float V4D::pixelRatioY() { + return fbCtx()->pixelRatioY(); +} + +const cv::Size& V4D::fbSize() const { + return fbCtx()->size(); +} + +const cv::Size& V4D::initialSize() const { + return initialSize_; +} + +cv::Size V4D::size() { + return fbCtx()->getWindowSize(); +} + +void V4D::setSize(const cv::Size& sz) { + fbCtx()->setWindowSize(sz); +} + +void V4D::setShowFPS(bool s) { + showFPS_ = s; +} + +bool V4D::getShowFPS() { + return showFPS_; +} + +void V4D::setPrintFPS(bool p) { + printFPS_ = p; +} + +bool V4D::getPrintFPS() { + return printFPS_; +} + +void V4D::setShowTracking(bool st) { + showTracking_ = st; +} + +void V4D::setDisableIO(bool d) { + disableIO_ = d; +} + +bool V4D::getShowTracking() { + return showTracking_; +} + +bool V4D::isFullscreen() { + return fbCtx()->isFullscreen(); +} + +void V4D::setFullscreen(bool f) { + fbCtx()->setFullscreen(f); +} + +bool V4D::isResizable() { + return fbCtx()->isResizable(); +} + +void V4D::setResizable(bool r) { + fbCtx()->setResizable(r); +} + +bool V4D::isVisible() { + return fbCtx()->isVisible(); +} + 
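The accessors above are easiest to read together with the source/sink factories from `util.cpp`. Below is a minimal, hypothetical sketch that wires them up using only calls introduced in this patch (`V4D::make`, `makeCaptureSource`, `makeWriterSink`, `setSource`/`setSink`, `keepRunning`, `display`); the window size, title, flag value and sample count are illustrative assumptions, and the shipped samples drive the loop through Plan rather than calling `display()` by hand.

```
#include <opencv2/v4d/v4d.hpp>

using namespace cv::v4d;

int main() {
    // Illustrative arguments: 960x960 window, NanoVG context allocated,
    // visible (not offscreen), no debug, 0 MSAA samples.
    cv::Ptr<V4D> window = V4D::make(cv::Size(960, 960), "V4D demo", NANOVG, false, false, 0);

    // Hardware-accelerated capture/encode where available (the VAAPI path in
    // util.cpp), falling back to plain FFMPEG otherwise.
    cv::Ptr<Source> src = makeCaptureSource(window, "input.mkv");
    window->setSource(src);
    window->setSink(makeWriterSink(window, "output.mkv", src->fps(), window->fbSize()));

    // keepRunning() installs SIGINT/SIGTERM handlers and turns false once a
    // termination signal arrives; display() swaps buffers and returns false
    // when the window should close.
    while (keepRunning() && window->display()) {
        // ... render into the framebuffer / run the Plan graph here ...
    }
    return 0;
}
```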
+void V4D::setVisible(bool v) { + fbCtx()->setVisible(v); +} + +void V4D::setStretching(bool s) { + stretching_ = s; +} + +bool V4D::isStretching() { + return stretching_; +} + +void V4D::setFocused(bool f) { + focused_ = f; +} + +bool V4D::isFocused() { + return focused_; +} + +void V4D::swapContextBuffers() { + { + FrameBufferContext::GLScope glScope(glCtx(-1)->fbCtx(), GL_READ_FRAMEBUFFER); + glCtx(-1)->fbCtx()->blitFrameBufferToFrameBuffer(viewport(), glCtx(-1)->fbCtx()->getWindowSize(), 0, isStretching()); +// GL_CHECK(glFinish()); + glfwSwapBuffers(glCtx(-1)->fbCtx()->getGLFWWindow()); + } + + for(size_t i = 0; i < numGlCtx(); ++i) { + FrameBufferContext::GLScope glScope(glCtx(i)->fbCtx(), GL_READ_FRAMEBUFFER); + glCtx(i)->fbCtx()->blitFrameBufferToFrameBuffer(viewport(), glCtx(i)->fbCtx()->getWindowSize(), 0, isStretching()); +// GL_CHECK(glFinish()); + glfwSwapBuffers(glCtx(i)->fbCtx()->getGLFWWindow()); + } + + if(hasNvgCtx()) { + FrameBufferContext::GLScope glScope(nvgCtx()->fbCtx(), GL_READ_FRAMEBUFFER); + nvgCtx()->fbCtx()->blitFrameBufferToFrameBuffer(viewport(), nvgCtx()->fbCtx()->getWindowSize(), 0, isStretching()); +// GL_CHECK(glFinish()); + glfwSwapBuffers(nvgCtx()->fbCtx()->getGLFWWindow()); + } +} + +bool V4D::display() { + bool result = true; + if(!Global::is_main()) + Global::next_frame_cnt(); + + if(debug_) { + swapContextBuffers(); + } + if (Global::is_main()) { + auto start = Global::start_time(); + auto now = get_epoch_nanos(); + auto diff = now - start; + double diffSeconds = diff / 1000000000.0; + + if(Global::fps() > 0 && diffSeconds > 1.0) { + Global::add_to_start_time(diff / 2.0); + Global::mul_frame_cnt(0.5); + } else { + Global::set_fps((Global::fps() * 3.0 + (Global::frame_cnt() / diffSeconds)) / 4.0); + } + + if(getPrintFPS()) + cerr << "\rFPS:" << Global::fps() << endl; + { + FrameBufferContext::GLScope glScope(fbCtx(), GL_READ_FRAMEBUFFER); + fbCtx()->blitFrameBufferToFrameBuffer(viewport(), fbCtx()->getWindowSize(), 0, isStretching()); + } + + if(hasImguiCtx()) + imguiCtx()->render(getShowFPS()); + + glfwSwapBuffers(fbCtx()->getGLFWWindow()); + glfwPollEvents(); + } else { + fbCtx()->copyToRootWindow(); + } + + result = !glfwWindowShouldClose(getGLFWWindow()); + +//FIXME doesn't have any effect +// GL_CHECK(glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0)); +//#if !defined(OPENCV_V4D_USE_ES3) +// GL_CHECK(glDrawBuffer(GL_BACK)); +//#endif +// GL_CHECK(glViewport(0, 0, size().width, size().height)); +// GL_CHECK(glClearColor(1,0,0,255)); +// GL_CHECK(glClear(GL_COLOR_BUFFER_BIT)); + + if (frameCnt_ == (std::numeric_limits().max() - 1)) + frameCnt_ = 0; + else + ++frameCnt_; + + return result; +} + +const uint64_t& V4D::frameCount() const { + return frameCnt_; +} + +bool V4D::isClosed() { + return fbCtx()->isClosed(); +} + +void V4D::close() { + fbCtx()->close(); +} + +GLFWwindow* V4D::getGLFWWindow() const { + return fbCtx()->getGLFWWindow(); +} + +void V4D::printSystemInfo() { + cerr << "OpenGL: " << getGlInfo() << endl; +#ifdef HAVE_OPENCL + if(cv::ocl::useOpenCL()) + cerr << "OpenCL Platforms: " << getClInfo() << endl; +#endif +} + +AllocateFlags V4D::flags() { + return flags_; +} + +cv::Ptr V4D::self() { + return self_; +} + + +} +} diff --git a/modules/v4d/third/bgfx.cmake b/modules/v4d/third/bgfx.cmake new file mode 160000 index 000000000..f53151639 --- /dev/null +++ b/modules/v4d/third/bgfx.cmake @@ -0,0 +1 @@ +Subproject commit f531516396e7507f63c0e448543bc6d9bc546191 diff --git a/modules/v4d/third/doxygen-bootstrapped 
b/modules/v4d/third/doxygen-bootstrapped new file mode 160000 index 000000000..1a0c8794b --- /dev/null +++ b/modules/v4d/third/doxygen-bootstrapped @@ -0,0 +1 @@ +Subproject commit 1a0c8794b2a13a65ebec4653fc57f20dd83b2df2 diff --git a/modules/v4d/third/imgui b/modules/v4d/third/imgui new file mode 160000 index 000000000..a5c7e1b51 --- /dev/null +++ b/modules/v4d/third/imgui @@ -0,0 +1 @@ +Subproject commit a5c7e1b51c3df889f8668b78a58ef538dc2a1f4b diff --git a/modules/v4d/third/kallaballa/midiplayback.cpp b/modules/v4d/third/kallaballa/midiplayback.cpp new file mode 100644 index 000000000..b735cfb4b --- /dev/null +++ b/modules/v4d/third/kallaballa/midiplayback.cpp @@ -0,0 +1,120 @@ +#include "midiplayback.hpp" + +#include +#include +#include +#include + +using namespace std::chrono; + +std::vector::iterator findClosestEvent(std::vector & data, uint64_t key) +{ + if (data.size() == 0) { + throw std::out_of_range("Received empty vector."); + } + + std::vector::iterator lower = std::lower_bound(data.begin(), data.end(), key, [](const MidiEvent& lhs, const uint64_t& ts) { + return lhs.timestamp_ < ts; + }); + + if (lower == data.end()) // If none found, return the last one. + return data.end()-1; + + if (lower == data.begin()) + return lower; + + // Check which one is closest. + auto previous = std::prev(lower); + if ((key - (*previous).timestamp_) < ((*lower).timestamp_ - key)) + return previous; + + return lower; +} + + +MidiPlayback::MidiPlayback(int32_t inport) : recv_(inport, false) { +} + +MidiPlayback::~MidiPlayback(){ +} + +void MidiPlayback::record() { + std::unique_lock lock(bufferMtx_); + if(running_) + return; + + firstTimestamp_ = 0; + recordBuffer_.clear(); + running_ = true; + recv_.start(); + + std::thread t([&]() { + while(running_) { + std::this_thread::sleep_for(10ms); + std::vector events = recv_.receive(); + std::unique_lock lock(bufferMtx_); + if(events.empty()) + continue; + + recordBuffer_.insert(recordBuffer_.end(), events.begin(), events.end()); + } + }); + t.detach(); +} + +void MidiPlayback::stop() { + std::unique_lock lock(bufferMtx_); + recv_.stop(); + running_ = false; +} + +std::vector MidiPlayback::get_until_epoch(uint64_t sinceEpoch) { + std::unique_lock lock(bufferMtx_); + + if(recordBuffer_.empty()) { + return {}; + } + + if(firstTimestamp_ == 0) { + firstTimestamp_ = recordBuffer_.front().timestamp_; + } + + uint64_t timestamp = sinceEpoch; + + std::vector::iterator it = std::lower_bound(recordBuffer_.begin(), recordBuffer_.end(), timestamp, [](const MidiEvent& lhs, const uint64_t& ts) { + return lhs.timestamp_ < ts; + }); + + + std::vector queuedEvents; + queuedEvents.insert(queuedEvents.end(), recordBuffer_.begin(), it); + recordBuffer_.erase(recordBuffer_.begin(), it); + + return queuedEvents; +} + + +std::vector MidiPlayback::get_until_tick(uint64_t tick) { + std::unique_lock lock(bufferMtx_); + + if(recordBuffer_.empty()) { + return {}; + } + + if(firstTimestamp_ == 0) { + firstTimestamp_ = recordBuffer_.front().timestamp_; + } + + uint64_t timestamp = firstTimestamp_ + tick; + + std::vector::iterator it = std::lower_bound(recordBuffer_.begin(), recordBuffer_.end(), timestamp, [](const MidiEvent& lhs, const uint64_t& ts) { + return lhs.timestamp_ < ts; + }); + + + std::vector queuedEvents; + queuedEvents.insert(queuedEvents.end(), recordBuffer_.begin(), it); + recordBuffer_.erase(recordBuffer_.begin(), it); + + return queuedEvents; +} diff --git a/modules/v4d/third/kallaballa/midiplayback.hpp b/modules/v4d/third/kallaballa/midiplayback.hpp new file mode 
100644 index 000000000..941fd69a8 --- /dev/null +++ b/modules/v4d/third/kallaballa/midiplayback.hpp @@ -0,0 +1,26 @@ +#ifndef SRC_LIB_MIDIPLAYBACK_HPP_ +#define SRC_LIB_MIDIPLAYBACK_HPP_ + +#include "midireceiver.hpp" +#include + +#include + +class MidiPlayback { + MidiReceiver recv_; + bool running_ = false; + uint64_t firstTimestamp_ = 0; + std::vector recordBuffer_; + std::mutex bufferMtx_; +public: + MidiPlayback(int32_t inport); + virtual ~MidiPlayback(); + void record(); + void stop(); + std::vector get_until_epoch(uint64_t epoch); + std::vector get_until_tick(uint64_t tick); +}; + + + +#endif /* SRC_LIB_MIDIPLAYBACK_HPP_ */ diff --git a/modules/v4d/third/kallaballa/midireceiver.cpp b/modules/v4d/third/kallaballa/midireceiver.cpp new file mode 100644 index 000000000..9667e5202 --- /dev/null +++ b/modules/v4d/third/kallaballa/midireceiver.cpp @@ -0,0 +1,75 @@ +#include "midireceiver.hpp" + +//#include + +std::vector* MidiReceiver::queue_ = new std::vector(); +std::mutex* MidiReceiver::evMtx_ = new std::mutex(); + +void midiCallback(double deltatime, std::vector* msg, void* userData) { + std::unique_lock lock(*MidiReceiver::evMtx_); + int nBytes; + nBytes = (*msg).size(); + MidiEvent ev; + + if (nBytes == 3) { + int mask = ((*msg)[0] & 240); + if(mask == 144 || mask == 128) { + ev.on_ = (mask == 144); + ev.channel_ = (*msg)[0] & 15; + ev.note_ = (*msg)[1]; + ev.velocity_ = (*msg)[2]; + if(ev.velocity_ == 0) + ev.on_ = false; + } else if(mask == 176) { + ev.cc_ = true; + ev.channel_ = (*msg)[0] & 15; + ev.controller_ = (*msg)[1]; + ev.value_ = (*msg)[2]; + } else + return; + } else if(nBytes == 1) { + ev.clock_ = true; + } + ev.timestamp_ = std::chrono::duration_cast(std::chrono::system_clock::now().time_since_epoch()).count(); +// std::cerr << "\tmr ts: " << ev.timestamp_ << std::endl; + MidiReceiver::queue_->push_back(ev); +} + +MidiReceiver::MidiReceiver(int32_t inport, bool autostart) : inport_(inport) { + midiin_->ignoreTypes(true, false, true); + if(autostart) + start(); +} + +MidiReceiver::~MidiReceiver() { + stop(); +} + +void MidiReceiver::start() { + midiin_->setCallback(midiCallback); + midiin_->openPort(inport_); +} + +void MidiReceiver::stop() { + midiin_->cancelCallback(); + midiin_->closePort(); +} + +void MidiReceiver::clear() { + std::unique_lock lock(*evMtx_); + queue_->clear(); +} + + +std::vector MidiReceiver::receive() { + std::unique_lock lock(*evMtx_); + std::vector ret = *queue_; + queue_->clear(); + return ret; +} + +std::ostream& operator<<(std::ostream &out, const MidiEvent& ev) { + out << ev.on_ << '\t' << ev.channel_ << '\t' << ev.note_ << '\t' << ev.velocity_ ; + return out; +} + diff --git a/modules/v4d/third/kallaballa/midireceiver.hpp b/modules/v4d/third/kallaballa/midireceiver.hpp new file mode 100644 index 000000000..2a564e40c --- /dev/null +++ b/modules/v4d/third/kallaballa/midireceiver.hpp @@ -0,0 +1,38 @@ +#ifndef SRC_SLIDE_MIDI_HPP_ +#define SRC_SLIDE_MIDI_HPP_ + +#include +#include +#include +#include + +struct MidiEvent { + bool on_ = false; + bool cc_ = false; + bool clock_ = false; + uint16_t note_ = 0; + uint16_t velocity_ = 0; + uint16_t channel_ = 0; + uint64_t timestamp_ = 0; + uint16_t controller_ = 0; + uint16_t value_ = 0; +}; + +std::ostream& operator<<(std::ostream &out, const MidiEvent& ev); +class MidiReceiver { +private: + RtMidiIn *midiin_ = new RtMidiIn(); + int32_t inport_; +public: + static std::vector* queue_; + static std::mutex* evMtx_; + MidiReceiver(int32_t inport, bool autostart = true); + virtual ~MidiReceiver(); + void 
start(); + void stop(); + void clear(); + + std::vector receive(); +}; + +#endif /* SRC_SLIDE_MIDI_HPP_ */ diff --git a/modules/v4d/third/nanovg b/modules/v4d/third/nanovg new file mode 160000 index 000000000..b2b7efa7d --- /dev/null +++ b/modules/v4d/third/nanovg @@ -0,0 +1 @@ +Subproject commit b2b7efa7d27fa8e047cb279000dc419574d2176b diff --git a/modules/v4d/third/stb/stb_image.h b/modules/v4d/third/stb/stb_image.h new file mode 100644 index 000000000..5e807a0a6 --- /dev/null +++ b/modules/v4d/third/stb/stb_image.h @@ -0,0 +1,7987 @@ +/* stb_image - v2.28 - public domain image loader - http://nothings.org/stb + no warranty implied; use at your own risk + + Do this: + #define STB_IMAGE_IMPLEMENTATION + before you include this file in *one* C or C++ file to create the implementation. + + // i.e. it should look like this: + #include ... + #include ... + #include ... + #define STB_IMAGE_IMPLEMENTATION + #include "stb_image.h" + + You can #define STBI_ASSERT(x) before the #include to avoid using assert.h. + And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free + + + QUICK NOTES: + Primarily of interest to game developers and other people who can + avoid problematic images and only need the trivial interface + + JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib) + PNG 1/2/4/8/16-bit-per-channel + + TGA (not sure what subset, if a subset) + BMP non-1bpp, non-RLE + PSD (composited view only, no extra channels, 8/16 bit-per-channel) + + GIF (*comp always reports as 4-channel) + HDR (radiance rgbE format) + PIC (Softimage PIC) + PNM (PPM and PGM binary only) + + Animated GIF still needs a proper API, but here's one way to do it: + http://gist.github.com/urraka/685d9a6340b26b830d49 + + - decode from memory or through FILE (define STBI_NO_STDIO to remove code) + - decode from arbitrary I/O callbacks + - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON) + + Full documentation under "DOCUMENTATION" below. + + +LICENSE + + See end of file for license information. + +RECENT REVISION HISTORY: + + 2.28 (2023-01-29) many error fixes, security errors, just tons of stuff + 2.27 (2021-07-11) document stbi_info better, 16-bit PNM support, bug fixes + 2.26 (2020-07-13) many minor fixes + 2.25 (2020-02-02) fix warnings + 2.24 (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically + 2.23 (2019-08-11) fix clang static analysis warning + 2.22 (2019-03-04) gif fixes, fix warnings + 2.21 (2019-02-25) fix typo in comment + 2.20 (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs + 2.19 (2018-02-11) fix warning + 2.18 (2018-01-30) fix warnings + 2.17 (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings + 2.16 (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes + 2.15 (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC + 2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs + 2.13 (2016-12-04) experimental 16-bit API, only for PNG so far; fixes + 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes + 2.11 (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64 + RGB-format JPEG; remove white matting in PSD; + allocate large structures on the stack; + correct channel count for PNG & BMP + 2.10 (2016-01-22) avoid warning introduced in 2.09 + 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED + + See end of file for full revision history. 
+ + + ============================ Contributors ========================= + + Image formats Extensions, features + Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info) + Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info) + Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG) + Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks) + Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG) + Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip) + Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD) + github:urraka (animated gif) Junggon Kim (PNM comments) + Christopher Forseth (animated gif) Daniel Gibson (16-bit TGA) + socks-the-fox (16-bit PNG) + Jeremy Sawicki (handle all ImageNet JPGs) + Optimizations & bugfixes Mikhail Morozov (1-bit BMP) + Fabian "ryg" Giesen Anael Seghezzi (is-16-bit query) + Arseny Kapoulkine Simon Breuss (16-bit PNM) + John-Mark Allen + Carmelo J Fdez-Aguera + + Bug & warning fixes + Marc LeBlanc David Woo Guillaume George Martins Mozeiko + Christpher Lloyd Jerry Jansson Joseph Thomson Blazej Dariusz Roszkowski + Phil Jordan Dave Moore Roy Eltham + Hayaki Saito Nathan Reed Won Chun + Luke Graham Johan Duparc Nick Verigakis the Horde3D community + Thomas Ruf Ronny Chevalier github:rlyeh + Janez Zemva John Bartholomew Michal Cichon github:romigrou + Jonathan Blow Ken Hamada Tero Hanninen github:svdijk + Eugene Golushkov Laurent Gomila Cort Stratton github:snagar + Aruelien Pocheville Sergio Gonzalez Thibault Reuille github:Zelex + Cass Everitt Ryamond Barbiero github:grim210 + Paul Du Bois Engin Manap Aldo Culquicondor github:sammyhw + Philipp Wiesemann Dale Weiler Oriol Ferrer Mesia github:phprus + Josh Tobin Neil Bickford Matthew Gregan github:poppolopoppo + Julian Raschke Gregory Mullen Christian Floisand github:darealshinji + Baldur Karlsson Kevin Schmidt JR Smith github:Michaelangel007 + Brad Weinberger Matvey Cherevko github:mosra + Luca Sas Alexander Veselov Zack Middleton [reserved] + Ryan C. Gordon [reserved] [reserved] + DO NOT ADD YOUR NAME HERE + + Jacko Dirks + + To add your name to the credits, pick a random blank space in the middle and fill it. + 80% of merge conflicts on stb PRs are due to people adding their name at the end + of the credits. +*/ + +#ifndef STBI_INCLUDE_STB_IMAGE_H +#define STBI_INCLUDE_STB_IMAGE_H + +// DOCUMENTATION +// +// Limitations: +// - no 12-bit-per-channel JPEG +// - no JPEGs with arithmetic coding +// - GIF always returns *comp=4 +// +// Basic usage (see HDR discussion below for HDR usage): +// int x,y,n; +// unsigned char *data = stbi_load(filename, &x, &y, &n, 0); +// // ... process data if not NULL ... +// // ... x = width, y = height, n = # 8-bit components per pixel ... +// // ... replace '0' with '1'..'4' to force that many components per pixel +// // ... but 'n' will always be the number that it would have been if you said 0 +// stbi_image_free(data); +// +// Standard parameters: +// int *x -- outputs image width in pixels +// int *y -- outputs image height in pixels +// int *channels_in_file -- outputs # of image components in image file +// int desired_channels -- if non-zero, # of image components requested in result +// +// The return value from an image loader is an 'unsigned char *' which points +// to the pixel data, or NULL on an allocation failure or if the image is +// corrupt or invalid. The pixel data consists of *y scanlines of *x pixels, +// with each pixel consisting of N interleaved 8-bit components; the first +// pixel pointed to is top-left-most in the image. 
There is no padding between +// image scanlines or between pixels, regardless of format. The number of +// components N is 'desired_channels' if desired_channels is non-zero, or +// *channels_in_file otherwise. If desired_channels is non-zero, +// *channels_in_file has the number of components that _would_ have been +// output otherwise. E.g. if you set desired_channels to 4, you will always +// get RGBA output, but you can check *channels_in_file to see if it's trivially +// opaque because e.g. there were only 3 channels in the source image. +// +// An output image with N components has the following components interleaved +// in this order in each pixel: +// +// N=#comp components +// 1 grey +// 2 grey, alpha +// 3 red, green, blue +// 4 red, green, blue, alpha +// +// If image loading fails for any reason, the return value will be NULL, +// and *x, *y, *channels_in_file will be unchanged. The function +// stbi_failure_reason() can be queried for an extremely brief, end-user +// unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS +// to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly +// more user-friendly ones. +// +// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized. +// +// To query the width, height and component count of an image without having to +// decode the full file, you can use the stbi_info family of functions: +// +// int x,y,n,ok; +// ok = stbi_info(filename, &x, &y, &n); +// // returns ok=1 and sets x, y, n if image is a supported format, +// // 0 otherwise. +// +// Note that stb_image pervasively uses ints in its public API for sizes, +// including sizes of memory buffers. This is now part of the API and thus +// hard to change without causing breakage. As a result, the various image +// loaders all have certain limits on image size; these differ somewhat +// by format but generally boil down to either just under 2GB or just under +// 1GB. When the decoded image would be larger than this, stb_image decoding +// will fail. +// +// Additionally, stb_image will reject image files that have any of their +// dimensions set to a larger value than the configurable STBI_MAX_DIMENSIONS, +// which defaults to 2**24 = 16777216 pixels. Due to the above memory limit, +// the only way to have an image with such dimensions load correctly +// is for it to have a rather extreme aspect ratio. Either way, the +// assumption here is that such larger images are likely to be malformed +// or malicious. If you do need to load an image with individual dimensions +// larger than that, and it still fits in the overall size limit, you can +// #define STBI_MAX_DIMENSIONS on your own to be something larger. +// +// =========================================================================== +// +// UNICODE: +// +// If compiling for Windows and you wish to use Unicode filenames, compile +// with +// #define STBI_WINDOWS_UTF8 +// and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert +// Windows wchar_t filenames to utf8. +// +// =========================================================================== +// +// Philosophy +// +// stb libraries are designed with the following priorities: +// +// 1. easy to use +// 2. easy to maintain +// 3. good performance +// +// Sometimes I let "good performance" creep up in priority over "easy to maintain", +// and for best performance I may provide less-easy-to-use APIs that give higher +// performance, in addition to the easy-to-use ones. 
Nevertheless, it's important +// to keep in mind that from the standpoint of you, a client of this library, +// all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all. +// +// Some secondary priorities arise directly from the first two, some of which +// provide more explicit reasons why performance can't be emphasized. +// +// - Portable ("ease of use") +// - Small source code footprint ("easy to maintain") +// - No dependencies ("ease of use") +// +// =========================================================================== +// +// I/O callbacks +// +// I/O callbacks allow you to read from arbitrary sources, like packaged +// files or some other source. Data read from callbacks are processed +// through a small internal buffer (currently 128 bytes) to try to reduce +// overhead. +// +// The three functions you must define are "read" (reads some bytes of data), +// "skip" (skips some bytes of data), "eof" (reports if the stream is at the end). +// +// =========================================================================== +// +// SIMD support +// +// The JPEG decoder will try to automatically use SIMD kernels on x86 when +// supported by the compiler. For ARM Neon support, you must explicitly +// request it. +// +// (The old do-it-yourself SIMD API is no longer supported in the current +// code.) +// +// On x86, SSE2 will automatically be used when available based on a run-time +// test; if not, the generic C versions are used as a fall-back. On ARM targets, +// the typical path is to have separate builds for NEON and non-NEON devices +// (at least this is true for iOS and Android). Therefore, the NEON support is +// toggled by a build flag: define STBI_NEON to get NEON loops. +// +// If for some reason you do not want to use any of SIMD code, or if +// you have issues compiling it, you can disable it entirely by +// defining STBI_NO_SIMD. +// +// =========================================================================== +// +// HDR image support (disable by defining STBI_NO_HDR) +// +// stb_image supports loading HDR images in general, and currently the Radiance +// .HDR file format specifically. You can still load any file through the existing +// interface; if you attempt to load an HDR file, it will be automatically remapped +// to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1; +// both of these constants can be reconfigured through this interface: +// +// stbi_hdr_to_ldr_gamma(2.2f); +// stbi_hdr_to_ldr_scale(1.0f); +// +// (note, do not use _inverse_ constants; stbi_image will invert them +// appropriately). 
+// +// Additionally, there is a new, parallel interface for loading files as +// (linear) floats to preserve the full dynamic range: +// +// float *data = stbi_loadf(filename, &x, &y, &n, 0); +// +// If you load LDR images through this interface, those images will +// be promoted to floating point values, run through the inverse of +// constants corresponding to the above: +// +// stbi_ldr_to_hdr_scale(1.0f); +// stbi_ldr_to_hdr_gamma(2.2f); +// +// Finally, given a filename (or an open file or memory block--see header +// file for details) containing image data, you can query for the "most +// appropriate" interface to use (that is, whether the image is HDR or +// not), using: +// +// stbi_is_hdr(char *filename); +// +// =========================================================================== +// +// iPhone PNG support: +// +// We optionally support converting iPhone-formatted PNGs (which store +// premultiplied BGRA) back to RGB, even though they're internally encoded +// differently. To enable this conversion, call +// stbi_convert_iphone_png_to_rgb(1). +// +// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per +// pixel to remove any premultiplied alpha *only* if the image file explicitly +// says there's premultiplied data (currently only happens in iPhone images, +// and only if iPhone convert-to-rgb processing is on). +// +// =========================================================================== +// +// ADDITIONAL CONFIGURATION +// +// - You can suppress implementation of any of the decoders to reduce +// your code footprint by #defining one or more of the following +// symbols before creating the implementation. +// +// STBI_NO_JPEG +// STBI_NO_PNG +// STBI_NO_BMP +// STBI_NO_PSD +// STBI_NO_TGA +// STBI_NO_GIF +// STBI_NO_HDR +// STBI_NO_PIC +// STBI_NO_PNM (.ppm and .pgm) +// +// - You can request *only* certain decoders and suppress all other ones +// (this will be more forward-compatible, as addition of new decoders +// doesn't require you to disable them explicitly): +// +// STBI_ONLY_JPEG +// STBI_ONLY_PNG +// STBI_ONLY_BMP +// STBI_ONLY_PSD +// STBI_ONLY_TGA +// STBI_ONLY_GIF +// STBI_ONLY_HDR +// STBI_ONLY_PIC +// STBI_ONLY_PNM (.ppm and .pgm) +// +// - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still +// want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB +// +// - If you define STBI_MAX_DIMENSIONS, stb_image will reject images greater +// than that size (in either width or height) without further processing. +// This is to let programs in the wild set an upper bound to prevent +// denial-of-service attacks on untrusted data, as one could generate a +// valid image of gigantic dimensions and force stb_image to allocate a +// huge block of memory and spend disproportionate time decoding it. By +// default this is set to (1 << 24), which is 16777216, but that's still +// very big. 
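Since the "Basic usage" notes above are reproduced verbatim inside the diff, a compact example may be easier to follow. This is a sketch of the documented contract only; the filename and the forced four-channel request are illustrative.

```
// demo.cpp -- exercises the stbi_load() contract quoted above.
#define STB_IMAGE_IMPLEMENTATION   // emit the implementation in exactly one translation unit
#include "stb_image.h"

#include <cstdio>

int main() {
    int x = 0, y = 0, n = 0;
    // Request 4 components; n still reports how many the file itself carried.
    unsigned char* data = stbi_load("input.png", &x, &y, &n, 4);
    if (!data) {
        std::printf("load failed: %s\n", stbi_failure_reason());
        return 1;
    }
    std::printf("%dx%d, %d channel(s) in file, returned as RGBA\n", x, y, n);
    // The first pixel is the top-left one; scanlines are tightly packed.
    std::printf("top-left pixel: %d %d %d %d\n", data[0], data[1], data[2], data[3]);
    stbi_image_free(data);
    return 0;
}
```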
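The I/O-callback interface mentioned above (`read`, `skip`, `eof`) is declared a few lines further down as `stbi_io_callbacks`. A small, purely illustrative adapter over a memory buffer looks like this; in practice `stbi_load_from_memory` already covers this case, and a custom callback set is mainly useful for packed archives or streaming sources.

```
#include "stb_image.h"
#include <algorithm>
#include <cstring>

struct MemStream { const unsigned char* data; int size; int pos; };

static int mem_read(void* user, char* out, int size) {
    MemStream* s = static_cast<MemStream*>(user);
    int n = std::min(size, s->size - s->pos);
    std::memcpy(out, s->data + s->pos, n);
    s->pos += n;
    return n;                                    // bytes actually read
}
static void mem_skip(void* user, int n) {
    static_cast<MemStream*>(user)->pos += n;     // negative n "ungets" bytes
}
static int mem_eof(void* user) {
    MemStream* s = static_cast<MemStream*>(user);
    return s->pos >= s->size;
}

// Usage (illustrative):
//   MemStream ms { bytes, numBytes, 0 };
//   stbi_io_callbacks cb { mem_read, mem_skip, mem_eof };
//   int x, y, n;
//   unsigned char* img = stbi_load_from_callbacks(&cb, &ms, &x, &y, &n, 0);
```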
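For the HDR notes above, the two entry points are probing with `stbi_is_hdr()` and then either loading linear floats via `stbi_loadf()` or letting stb_image remap to 8-bit with the reconfigurable gamma/scale constants. A brief sketch; the gamma and scale values simply restate the documented defaults.

```
#include "stb_image.h"

void load_radiance(const char* filename) {
    int x, y, n;
    if (stbi_is_hdr(filename)) {
        // Full dynamic range as linear floats:
        float* hdr = stbi_loadf(filename, &x, &y, &n, 0);
        if (hdr) { /* ... use hdr ... */ stbi_image_free(hdr); }

        // ...or tone-map to LDR through the 8-bit interface:
        stbi_hdr_to_ldr_gamma(2.2f);   // documented default
        stbi_hdr_to_ldr_scale(1.0f);   // documented default
        unsigned char* ldr = stbi_load(filename, &x, &y, &n, 0);
        if (ldr) { /* ... use ldr ... */ stbi_image_free(ldr); }
    }
}
```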
+ +#ifndef STBI_NO_STDIO +#include +#endif // STBI_NO_STDIO + +#define STBI_VERSION 1 + +enum +{ + STBI_default = 0, // only used for desired_channels + + STBI_grey = 1, + STBI_grey_alpha = 2, + STBI_rgb = 3, + STBI_rgb_alpha = 4 +}; + +#include +typedef unsigned char stbi_uc; +typedef unsigned short stbi_us; + +#ifdef __cplusplus +extern "C" { +#endif + +#ifndef STBIDEF +#ifdef STB_IMAGE_STATIC +#define STBIDEF static +#else +#define STBIDEF extern +#endif +#endif + +////////////////////////////////////////////////////////////////////////////// +// +// PRIMARY API - works on images of any type +// + +// +// load image by filename, open file, or memory buffer +// + +typedef struct +{ + int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read + void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative + int (*eof) (void *user); // returns nonzero if we are at end of file/data +} stbi_io_callbacks; + +//////////////////////////////////// +// +// 8-bits-per-channel interface +// + +STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *channels_in_file, int desired_channels); +STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *channels_in_file, int desired_channels); + +#ifndef STBI_NO_STDIO +STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels); +STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels); +// for stbi_load_from_file, file pointer is left pointing immediately after image +#endif + +#ifndef STBI_NO_GIF +STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp); +#endif + +#ifdef STBI_WINDOWS_UTF8 +STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input); +#endif + +//////////////////////////////////// +// +// 16-bits-per-channel interface +// + +STBIDEF stbi_us *stbi_load_16_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels); +STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels); + +#ifndef STBI_NO_STDIO +STBIDEF stbi_us *stbi_load_16 (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels); +STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels); +#endif + +//////////////////////////////////// +// +// float-per-channel interface +// +#ifndef STBI_NO_LINEAR + STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels); + STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels); + + #ifndef STBI_NO_STDIO + STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels); + STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels); + #endif +#endif + +#ifndef STBI_NO_HDR + STBIDEF void stbi_hdr_to_ldr_gamma(float gamma); + STBIDEF void stbi_hdr_to_ldr_scale(float scale); +#endif // STBI_NO_HDR + +#ifndef STBI_NO_LINEAR + STBIDEF void stbi_ldr_to_hdr_gamma(float gamma); + STBIDEF void 
stbi_ldr_to_hdr_scale(float scale); +#endif // STBI_NO_LINEAR + +// stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR +STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user); +STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len); +#ifndef STBI_NO_STDIO +STBIDEF int stbi_is_hdr (char const *filename); +STBIDEF int stbi_is_hdr_from_file(FILE *f); +#endif // STBI_NO_STDIO + + +// get a VERY brief reason for failure +// on most compilers (and ALL modern mainstream compilers) this is threadsafe +STBIDEF const char *stbi_failure_reason (void); + +// free the loaded image -- this is just free() +STBIDEF void stbi_image_free (void *retval_from_stbi_load); + +// get image dimensions & components without fully decoding +STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp); +STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp); +STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len); +STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user); + +#ifndef STBI_NO_STDIO +STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp); +STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp); +STBIDEF int stbi_is_16_bit (char const *filename); +STBIDEF int stbi_is_16_bit_from_file(FILE *f); +#endif + + + +// for image formats that explicitly notate that they have premultiplied alpha, +// we just return the colors as stored in the file. set this flag to force +// unpremultiplication. results are undefined if the unpremultiply overflow. +STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply); + +// indicate whether we should process iphone images back to canonical format, +// or just pass them through "as-is" +STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert); + +// flip the image vertically, so the first pixel in the output array is the bottom left +STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip); + +// as above, but only applies to images loaded on the thread that calls the function +// this function is only available if your compiler supports thread-local variables; +// calling it will fail to link if your compiler doesn't +STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply); +STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert); +STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip); + +// ZLIB client - used by PNG, available for other purposes + +STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen); +STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header); +STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen); +STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen); + +STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen); +STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen); + + +#ifdef __cplusplus +} +#endif + +// +// +//// end header file ///////////////////////////////////////////////////// +#endif // STBI_INCLUDE_STB_IMAGE_H + +#ifdef STB_IMAGE_IMPLEMENTATION + +#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || 
defined(STBI_ONLY_BMP) \ + || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \ + || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \ + || defined(STBI_ONLY_ZLIB) + #ifndef STBI_ONLY_JPEG + #define STBI_NO_JPEG + #endif + #ifndef STBI_ONLY_PNG + #define STBI_NO_PNG + #endif + #ifndef STBI_ONLY_BMP + #define STBI_NO_BMP + #endif + #ifndef STBI_ONLY_PSD + #define STBI_NO_PSD + #endif + #ifndef STBI_ONLY_TGA + #define STBI_NO_TGA + #endif + #ifndef STBI_ONLY_GIF + #define STBI_NO_GIF + #endif + #ifndef STBI_ONLY_HDR + #define STBI_NO_HDR + #endif + #ifndef STBI_ONLY_PIC + #define STBI_NO_PIC + #endif + #ifndef STBI_ONLY_PNM + #define STBI_NO_PNM + #endif +#endif + +#if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB) +#define STBI_NO_ZLIB +#endif + + +#include +#include // ptrdiff_t on osx +#include +#include +#include + +#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) +#include // ldexp, pow +#endif + +#ifndef STBI_NO_STDIO +#include +#endif + +#ifndef STBI_ASSERT +#include +#define STBI_ASSERT(x) assert(x) +#endif + +#ifdef __cplusplus +#define STBI_EXTERN extern "C" +#else +#define STBI_EXTERN extern +#endif + + +#ifndef _MSC_VER + #ifdef __cplusplus + #define stbi_inline inline + #else + #define stbi_inline + #endif +#else + #define stbi_inline __forceinline +#endif + +#ifndef STBI_NO_THREAD_LOCALS + #if defined(__cplusplus) && __cplusplus >= 201103L + #define STBI_THREAD_LOCAL thread_local + #elif defined(__GNUC__) && __GNUC__ < 5 + #define STBI_THREAD_LOCAL __thread + #elif defined(_MSC_VER) + #define STBI_THREAD_LOCAL __declspec(thread) + #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__) + #define STBI_THREAD_LOCAL _Thread_local + #endif + + #ifndef STBI_THREAD_LOCAL + #if defined(__GNUC__) + #define STBI_THREAD_LOCAL __thread + #endif + #endif +#endif + +#if defined(_MSC_VER) || defined(__SYMBIAN32__) +typedef unsigned short stbi__uint16; +typedef signed short stbi__int16; +typedef unsigned int stbi__uint32; +typedef signed int stbi__int32; +#else +#include +typedef uint16_t stbi__uint16; +typedef int16_t stbi__int16; +typedef uint32_t stbi__uint32; +typedef int32_t stbi__int32; +#endif + +// should produce compiler error if size is wrong +typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1]; + +#ifdef _MSC_VER +#define STBI_NOTUSED(v) (void)(v) +#else +#define STBI_NOTUSED(v) (void)sizeof(v) +#endif + +#ifdef _MSC_VER +#define STBI_HAS_LROTL +#endif + +#ifdef STBI_HAS_LROTL + #define stbi_lrot(x,y) _lrotl(x,y) +#else + #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (-(y) & 31))) +#endif + +#if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED)) +// ok +#elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED) +// ok +#else +#error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)." 
+#endif + +#ifndef STBI_MALLOC +#define STBI_MALLOC(sz) malloc(sz) +#define STBI_REALLOC(p,newsz) realloc(p,newsz) +#define STBI_FREE(p) free(p) +#endif + +#ifndef STBI_REALLOC_SIZED +#define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz) +#endif + +// x86/x64 detection +#if defined(__x86_64__) || defined(_M_X64) +#define STBI__X64_TARGET +#elif defined(__i386) || defined(_M_IX86) +#define STBI__X86_TARGET +#endif + +#if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD) +// gcc doesn't support sse2 intrinsics unless you compile with -msse2, +// which in turn means it gets to use SSE2 everywhere. This is unfortunate, +// but previous attempts to provide the SSE2 functions with runtime +// detection caused numerous issues. The way architecture extensions are +// exposed in GCC/Clang is, sadly, not really suited for one-file libs. +// New behavior: if compiled with -msse2, we use SSE2 without any +// detection; if not, we don't use it at all. +#define STBI_NO_SIMD +#endif + +#if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD) +// Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET +// +// 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the +// Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant. +// As a result, enabling SSE2 on 32-bit MinGW is dangerous when not +// simultaneously enabling "-mstackrealign". +// +// See https://github.com/nothings/stb/issues/81 for more information. +// +// So default to no SSE2 on 32-bit MinGW. If you've read this far and added +// -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2. +#define STBI_NO_SIMD +#endif + +#if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) +#define STBI_SSE2 +#include + +#ifdef _MSC_VER + +#if _MSC_VER >= 1400 // not VC6 +#include // __cpuid +static int stbi__cpuid3(void) +{ + int info[4]; + __cpuid(info,1); + return info[3]; +} +#else +static int stbi__cpuid3(void) +{ + int res; + __asm { + mov eax,1 + cpuid + mov res,edx + } + return res; +} +#endif + +#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name + +#if !defined(STBI_NO_JPEG) && defined(STBI_SSE2) +static int stbi__sse2_available(void) +{ + int info3 = stbi__cpuid3(); + return ((info3 >> 26) & 1) != 0; +} +#endif + +#else // assume GCC-style if not VC++ +#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16))) + +#if !defined(STBI_NO_JPEG) && defined(STBI_SSE2) +static int stbi__sse2_available(void) +{ + // If we're even attempting to compile this on GCC/Clang, that means + // -msse2 is on, which means the compiler is allowed to use SSE2 + // instructions at will, and so are we. 
+ return 1; +} +#endif + +#endif +#endif + +// ARM NEON +#if defined(STBI_NO_SIMD) && defined(STBI_NEON) +#undef STBI_NEON +#endif + +#ifdef STBI_NEON +#include +#ifdef _MSC_VER +#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name +#else +#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16))) +#endif +#endif + +#ifndef STBI_SIMD_ALIGN +#define STBI_SIMD_ALIGN(type, name) type name +#endif + +#ifndef STBI_MAX_DIMENSIONS +#define STBI_MAX_DIMENSIONS (1 << 24) +#endif + +/////////////////////////////////////////////// +// +// stbi__context struct and start_xxx functions + +// stbi__context structure is our basic context used by all images, so it +// contains all the IO context, plus some basic image information +typedef struct +{ + stbi__uint32 img_x, img_y; + int img_n, img_out_n; + + stbi_io_callbacks io; + void *io_user_data; + + int read_from_callbacks; + int buflen; + stbi_uc buffer_start[128]; + int callback_already_read; + + stbi_uc *img_buffer, *img_buffer_end; + stbi_uc *img_buffer_original, *img_buffer_original_end; +} stbi__context; + + +static void stbi__refill_buffer(stbi__context *s); + +// initialize a memory-decode context +static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len) +{ + s->io.read = NULL; + s->read_from_callbacks = 0; + s->callback_already_read = 0; + s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer; + s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len; +} + +// initialize a callback-based context +static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user) +{ + s->io = *c; + s->io_user_data = user; + s->buflen = sizeof(s->buffer_start); + s->read_from_callbacks = 1; + s->callback_already_read = 0; + s->img_buffer = s->img_buffer_original = s->buffer_start; + stbi__refill_buffer(s); + s->img_buffer_original_end = s->img_buffer_end; +} + +#ifndef STBI_NO_STDIO + +static int stbi__stdio_read(void *user, char *data, int size) +{ + return (int) fread(data,1,size,(FILE*) user); +} + +static void stbi__stdio_skip(void *user, int n) +{ + int ch; + fseek((FILE*) user, n, SEEK_CUR); + ch = fgetc((FILE*) user); /* have to read a byte to reset feof()'s flag */ + if (ch != EOF) { + ungetc(ch, (FILE *) user); /* push byte back onto stream if valid. 
*/ + } +} + +static int stbi__stdio_eof(void *user) +{ + return feof((FILE*) user) || ferror((FILE *) user); +} + +static stbi_io_callbacks stbi__stdio_callbacks = +{ + stbi__stdio_read, + stbi__stdio_skip, + stbi__stdio_eof, +}; + +static void stbi__start_file(stbi__context *s, FILE *f) +{ + stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f); +} + +//static void stop_file(stbi__context *s) { } + +#endif // !STBI_NO_STDIO + +static void stbi__rewind(stbi__context *s) +{ + // conceptually rewind SHOULD rewind to the beginning of the stream, + // but we just rewind to the beginning of the initial buffer, because + // we only use it after doing 'test', which only ever looks at at most 92 bytes + s->img_buffer = s->img_buffer_original; + s->img_buffer_end = s->img_buffer_original_end; +} + +enum +{ + STBI_ORDER_RGB, + STBI_ORDER_BGR +}; + +typedef struct +{ + int bits_per_channel; + int num_channels; + int channel_order; +} stbi__result_info; + +#ifndef STBI_NO_JPEG +static int stbi__jpeg_test(stbi__context *s); +static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp); +#endif + +#ifndef STBI_NO_PNG +static int stbi__png_test(stbi__context *s); +static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp); +static int stbi__png_is16(stbi__context *s); +#endif + +#ifndef STBI_NO_BMP +static int stbi__bmp_test(stbi__context *s); +static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp); +#endif + +#ifndef STBI_NO_TGA +static int stbi__tga_test(stbi__context *s); +static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp); +#endif + +#ifndef STBI_NO_PSD +static int stbi__psd_test(stbi__context *s); +static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc); +static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp); +static int stbi__psd_is16(stbi__context *s); +#endif + +#ifndef STBI_NO_HDR +static int stbi__hdr_test(stbi__context *s); +static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp); +#endif + +#ifndef STBI_NO_PIC +static int stbi__pic_test(stbi__context *s); +static void *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp); +#endif + +#ifndef STBI_NO_GIF +static int stbi__gif_test(stbi__context *s); +static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp); +static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp); +#endif + +#ifndef STBI_NO_PNM +static int stbi__pnm_test(stbi__context *s); +static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri); +static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp); +static int 
stbi__pnm_is16(stbi__context *s); +#endif + +static +#ifdef STBI_THREAD_LOCAL +STBI_THREAD_LOCAL +#endif +const char *stbi__g_failure_reason; + +STBIDEF const char *stbi_failure_reason(void) +{ + return stbi__g_failure_reason; +} + +#ifndef STBI_NO_FAILURE_STRINGS +static int stbi__err(const char *str) +{ + stbi__g_failure_reason = str; + return 0; +} +#endif + +static void *stbi__malloc(size_t size) +{ + return STBI_MALLOC(size); +} + +// stb_image uses ints pervasively, including for offset calculations. +// therefore the largest decoded image size we can support with the +// current code, even on 64-bit targets, is INT_MAX. this is not a +// significant limitation for the intended use case. +// +// we do, however, need to make sure our size calculations don't +// overflow. hence a few helper functions for size calculations that +// multiply integers together, making sure that they're non-negative +// and no overflow occurs. + +// return 1 if the sum is valid, 0 on overflow. +// negative terms are considered invalid. +static int stbi__addsizes_valid(int a, int b) +{ + if (b < 0) return 0; + // now 0 <= b <= INT_MAX, hence also + // 0 <= INT_MAX - b <= INT_MAX. + // And "a + b <= INT_MAX" (which might overflow) is the + // same as a <= INT_MAX - b (no overflow) + return a <= INT_MAX - b; +} + +// returns 1 if the product is valid, 0 on overflow. +// negative factors are considered invalid. +static int stbi__mul2sizes_valid(int a, int b) +{ + if (a < 0 || b < 0) return 0; + if (b == 0) return 1; // mul-by-0 is always safe + // portable way to check for no overflows in a*b + return a <= INT_MAX/b; +} + +#if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR) +// returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow +static int stbi__mad2sizes_valid(int a, int b, int add) +{ + return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add); +} +#endif + +// returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow +static int stbi__mad3sizes_valid(int a, int b, int c, int add) +{ + return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) && + stbi__addsizes_valid(a*b*c, add); +} + +// returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow +#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM) +static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add) +{ + return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) && + stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add); +} +#endif + +#if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR) +// mallocs with size overflow checking +static void *stbi__malloc_mad2(int a, int b, int add) +{ + if (!stbi__mad2sizes_valid(a, b, add)) return NULL; + return stbi__malloc(a*b + add); +} +#endif + +static void *stbi__malloc_mad3(int a, int b, int c, int add) +{ + if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL; + return stbi__malloc(a*b*c + add); +} + +#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM) +static void *stbi__malloc_mad4(int a, int b, int c, int d, int add) +{ + if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL; + return stbi__malloc(a*b*c*d + add); +} +#endif + +// returns 1 if the sum of two signed ints is valid (between -2^31 and 2^31-1 inclusive), 0 on overflow.
+static int stbi__addints_valid(int a, int b) +{ + if ((a >= 0) != (b >= 0)) return 1; // a and b have different signs, so no overflow + if (a < 0 && b < 0) return a >= INT_MIN - b; // same as a + b >= INT_MIN; INT_MIN - b cannot overflow since b < 0. + return a <= INT_MAX - b; +} + +// returns 1 if the product of two signed shorts is valid, 0 on overflow. +static int stbi__mul2shorts_valid(short a, short b) +{ + if (b == 0 || b == -1) return 1; // multiplication by 0 is always 0; check for -1 so SHRT_MIN/b doesn't overflow + if ((a >= 0) == (b >= 0)) return a <= SHRT_MAX/b; // product is positive, so similar to mul2sizes_valid + if (b < 0) return a <= SHRT_MIN / b; // same as a * b >= SHRT_MIN + return a >= SHRT_MIN / b; +} + +// stbi__err - error +// stbi__errpf - error returning pointer to float +// stbi__errpuc - error returning pointer to unsigned char + +#ifdef STBI_NO_FAILURE_STRINGS + #define stbi__err(x,y) 0 +#elif defined(STBI_FAILURE_USERMSG) + #define stbi__err(x,y) stbi__err(y) +#else + #define stbi__err(x,y) stbi__err(x) +#endif + +#define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL)) +#define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL)) + +STBIDEF void stbi_image_free(void *retval_from_stbi_load) +{ + STBI_FREE(retval_from_stbi_load); +} + +#ifndef STBI_NO_LINEAR +static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp); +#endif + +#ifndef STBI_NO_HDR +static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp); +#endif + +static int stbi__vertically_flip_on_load_global = 0; + +STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip) +{ + stbi__vertically_flip_on_load_global = flag_true_if_should_flip; +} + +#ifndef STBI_THREAD_LOCAL +#define stbi__vertically_flip_on_load stbi__vertically_flip_on_load_global +#else +static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set; + +STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip) +{ + stbi__vertically_flip_on_load_local = flag_true_if_should_flip; + stbi__vertically_flip_on_load_set = 1; +} + +#define stbi__vertically_flip_on_load (stbi__vertically_flip_on_load_set \ + ? 
stbi__vertically_flip_on_load_local \ + : stbi__vertically_flip_on_load_global) +#endif // STBI_THREAD_LOCAL + +static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc) +{ + memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields + ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed + ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order + ri->num_channels = 0; + + // test the formats with a very explicit header first (at least a FOURCC + // or distinctive magic number first) + #ifndef STBI_NO_PNG + if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp, ri); + #endif + #ifndef STBI_NO_BMP + if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp, ri); + #endif + #ifndef STBI_NO_GIF + if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp, ri); + #endif + #ifndef STBI_NO_PSD + if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc); + #else + STBI_NOTUSED(bpc); + #endif + #ifndef STBI_NO_PIC + if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp, ri); + #endif + + // then the formats that can end up attempting to load with just 1 or 2 + // bytes matching expectations; these are prone to false positives, so + // try them later + #ifndef STBI_NO_JPEG + if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri); + #endif + #ifndef STBI_NO_PNM + if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp, ri); + #endif + + #ifndef STBI_NO_HDR + if (stbi__hdr_test(s)) { + float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri); + return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp); + } + #endif + + #ifndef STBI_NO_TGA + // test tga last because it's a crappy test! + if (stbi__tga_test(s)) + return stbi__tga_load(s,x,y,comp,req_comp, ri); + #endif + + return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt"); +} + +static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels) +{ + int i; + int img_len = w * h * channels; + stbi_uc *reduced; + + reduced = (stbi_uc *) stbi__malloc(img_len); + if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory"); + + for (i = 0; i < img_len; ++i) + reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is sufficient approx of 16->8 bit scaling + + STBI_FREE(orig); + return reduced; +} + +static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels) +{ + int i; + int img_len = w * h * channels; + stbi__uint16 *enlarged; + + enlarged = (stbi__uint16 *) stbi__malloc(img_len*2); + if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory"); + + for (i = 0; i < img_len; ++i) + enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff + + STBI_FREE(orig); + return enlarged; +} + +static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel) +{ + int row; + size_t bytes_per_row = (size_t)w * bytes_per_pixel; + stbi_uc temp[2048]; + stbi_uc *bytes = (stbi_uc *)image; + + for (row = 0; row < (h>>1); row++) { + stbi_uc *row0 = bytes + row*bytes_per_row; + stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row; + // swap row0 with row1 + size_t bytes_left = bytes_per_row; + while (bytes_left) { + size_t bytes_copy = (bytes_left < sizeof(temp)) ? 
bytes_left : sizeof(temp); + memcpy(temp, row0, bytes_copy); + memcpy(row0, row1, bytes_copy); + memcpy(row1, temp, bytes_copy); + row0 += bytes_copy; + row1 += bytes_copy; + bytes_left -= bytes_copy; + } + } +} + +#ifndef STBI_NO_GIF +static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel) +{ + int slice; + int slice_size = w * h * bytes_per_pixel; + + stbi_uc *bytes = (stbi_uc *)image; + for (slice = 0; slice < z; ++slice) { + stbi__vertical_flip(bytes, w, h, bytes_per_pixel); + bytes += slice_size; + } +} +#endif + +static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp) +{ + stbi__result_info ri; + void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8); + + if (result == NULL) + return NULL; + + // it is the responsibility of the loaders to make sure we get either 8 or 16 bit. + STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16); + + if (ri.bits_per_channel != 8) { + result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp); + ri.bits_per_channel = 8; + } + + // @TODO: move stbi__convert_format to here + + if (stbi__vertically_flip_on_load) { + int channels = req_comp ? req_comp : *comp; + stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc)); + } + + return (unsigned char *) result; +} + +static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp) +{ + stbi__result_info ri; + void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16); + + if (result == NULL) + return NULL; + + // it is the responsibility of the loaders to make sure we get either 8 or 16 bit. + STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16); + + if (ri.bits_per_channel != 16) { + result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp); + ri.bits_per_channel = 16; + } + + // @TODO: move stbi__convert_format16 to here + // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision + + if (stbi__vertically_flip_on_load) { + int channels = req_comp ? req_comp : *comp; + stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16)); + } + + return (stbi__uint16 *) result; +} + +#if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR) +static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp) +{ + if (stbi__vertically_flip_on_load && result != NULL) { + int channels = req_comp ? 
req_comp : *comp; + stbi__vertical_flip(result, *x, *y, channels * sizeof(float)); + } +} +#endif + +#ifndef STBI_NO_STDIO + +#if defined(_WIN32) && defined(STBI_WINDOWS_UTF8) +STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide); +STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default); +#endif + +#if defined(_WIN32) && defined(STBI_WINDOWS_UTF8) +STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input) +{ + return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL); +} +#endif + +static FILE *stbi__fopen(char const *filename, char const *mode) +{ + FILE *f; +#if defined(_WIN32) && defined(STBI_WINDOWS_UTF8) + wchar_t wMode[64]; + wchar_t wFilename[1024]; + if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)/sizeof(*wFilename))) + return 0; + + if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)/sizeof(*wMode))) + return 0; + +#if defined(_MSC_VER) && _MSC_VER >= 1400 + if (0 != _wfopen_s(&f, wFilename, wMode)) + f = 0; +#else + f = _wfopen(wFilename, wMode); +#endif + +#elif defined(_MSC_VER) && _MSC_VER >= 1400 + if (0 != fopen_s(&f, filename, mode)) + f=0; +#else + f = fopen(filename, mode); +#endif + return f; +} + + +STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp) +{ + FILE *f = stbi__fopen(filename, "rb"); + unsigned char *result; + if (!f) return stbi__errpuc("can't fopen", "Unable to open file"); + result = stbi_load_from_file(f,x,y,comp,req_comp); + fclose(f); + return result; +} + +STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp) +{ + unsigned char *result; + stbi__context s; + stbi__start_file(&s,f); + result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp); + if (result) { + // need to 'unget' all the characters in the IO buffer + fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR); + } + return result; +} + +STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp) +{ + stbi__uint16 *result; + stbi__context s; + stbi__start_file(&s,f); + result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp); + if (result) { + // need to 'unget' all the characters in the IO buffer + fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR); + } + return result; +} + +STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp) +{ + FILE *f = stbi__fopen(filename, "rb"); + stbi__uint16 *result; + if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file"); + result = stbi_load_from_file_16(f,x,y,comp,req_comp); + fclose(f); + return result; +} + + +#endif //!STBI_NO_STDIO + +STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels) +{ + stbi__context s; + stbi__start_mem(&s,buffer,len); + return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels); +} + +STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels) +{ + stbi__context s; + stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user); + return 
stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels); +} + +STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp) +{ + stbi__context s; + stbi__start_mem(&s,buffer,len); + return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp); +} + +STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp) +{ + stbi__context s; + stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user); + return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp); +} + +#ifndef STBI_NO_GIF +STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp) +{ + unsigned char *result; + stbi__context s; + stbi__start_mem(&s,buffer,len); + + result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp); + if (stbi__vertically_flip_on_load) { + stbi__vertical_flip_slices( result, *x, *y, *z, *comp ); + } + + return result; +} +#endif + +#ifndef STBI_NO_LINEAR +static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp) +{ + unsigned char *data; + #ifndef STBI_NO_HDR + if (stbi__hdr_test(s)) { + stbi__result_info ri; + float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri); + if (hdr_data) + stbi__float_postprocess(hdr_data,x,y,comp,req_comp); + return hdr_data; + } + #endif + data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp); + if (data) + return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp); + return stbi__errpf("unknown image type", "Image not of any known type, or corrupt"); +} + +STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp) +{ + stbi__context s; + stbi__start_mem(&s,buffer,len); + return stbi__loadf_main(&s,x,y,comp,req_comp); +} + +STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp) +{ + stbi__context s; + stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user); + return stbi__loadf_main(&s,x,y,comp,req_comp); +} + +#ifndef STBI_NO_STDIO +STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp) +{ + float *result; + FILE *f = stbi__fopen(filename, "rb"); + if (!f) return stbi__errpf("can't fopen", "Unable to open file"); + result = stbi_loadf_from_file(f,x,y,comp,req_comp); + fclose(f); + return result; +} + +STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp) +{ + stbi__context s; + stbi__start_file(&s,f); + return stbi__loadf_main(&s,x,y,comp,req_comp); +} +#endif // !STBI_NO_STDIO + +#endif // !STBI_NO_LINEAR + +// these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is +// defined, for API simplicity; if STBI_NO_LINEAR is defined, it always +// reports false! 
+ +STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len) +{ + #ifndef STBI_NO_HDR + stbi__context s; + stbi__start_mem(&s,buffer,len); + return stbi__hdr_test(&s); + #else + STBI_NOTUSED(buffer); + STBI_NOTUSED(len); + return 0; + #endif +} + +#ifndef STBI_NO_STDIO +STBIDEF int stbi_is_hdr (char const *filename) +{ + FILE *f = stbi__fopen(filename, "rb"); + int result=0; + if (f) { + result = stbi_is_hdr_from_file(f); + fclose(f); + } + return result; +} + +STBIDEF int stbi_is_hdr_from_file(FILE *f) +{ + #ifndef STBI_NO_HDR + long pos = ftell(f); + int res; + stbi__context s; + stbi__start_file(&s,f); + res = stbi__hdr_test(&s); + fseek(f, pos, SEEK_SET); + return res; + #else + STBI_NOTUSED(f); + return 0; + #endif +} +#endif // !STBI_NO_STDIO + +STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user) +{ + #ifndef STBI_NO_HDR + stbi__context s; + stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user); + return stbi__hdr_test(&s); + #else + STBI_NOTUSED(clbk); + STBI_NOTUSED(user); + return 0; + #endif +} + +#ifndef STBI_NO_LINEAR +static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f; + +STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; } +STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; } +#endif + +static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f; + +STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; } +STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; } + + +////////////////////////////////////////////////////////////////////////////// +// +// Common code used by all image loaders +// + +enum +{ + STBI__SCAN_load=0, + STBI__SCAN_type, + STBI__SCAN_header +}; + +static void stbi__refill_buffer(stbi__context *s) +{ + int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen); + s->callback_already_read += (int) (s->img_buffer - s->img_buffer_original); + if (n == 0) { + // at end of file, treat same as if from memory, but need to handle case + // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file + s->read_from_callbacks = 0; + s->img_buffer = s->buffer_start; + s->img_buffer_end = s->buffer_start+1; + *s->img_buffer = 0; + } else { + s->img_buffer = s->buffer_start; + s->img_buffer_end = s->buffer_start + n; + } +} + +stbi_inline static stbi_uc stbi__get8(stbi__context *s) +{ + if (s->img_buffer < s->img_buffer_end) + return *s->img_buffer++; + if (s->read_from_callbacks) { + stbi__refill_buffer(s); + return *s->img_buffer++; + } + return 0; +} + +#if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM) +// nothing +#else +stbi_inline static int stbi__at_eof(stbi__context *s) +{ + if (s->io.read) { + if (!(s->io.eof)(s->io_user_data)) return 0; + // if feof() is true, check if buffer = end + // special case: we've only got the special 0 character at the end + if (s->read_from_callbacks == 0) return 1; + } + + return s->img_buffer >= s->img_buffer_end; +} +#endif + +#if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) +// nothing +#else +static void stbi__skip(stbi__context *s, int n) +{ + if (n == 0) return; // already there! 
+ if (n < 0) { + s->img_buffer = s->img_buffer_end; + return; + } + if (s->io.read) { + int blen = (int) (s->img_buffer_end - s->img_buffer); + if (blen < n) { + s->img_buffer = s->img_buffer_end; + (s->io.skip)(s->io_user_data, n - blen); + return; + } + } + s->img_buffer += n; +} +#endif + +#if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM) +// nothing +#else +static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n) +{ + if (s->io.read) { + int blen = (int) (s->img_buffer_end - s->img_buffer); + if (blen < n) { + int res, count; + + memcpy(buffer, s->img_buffer, blen); + + count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen); + res = (count == (n-blen)); + s->img_buffer = s->img_buffer_end; + return res; + } + } + + if (s->img_buffer+n <= s->img_buffer_end) { + memcpy(buffer, s->img_buffer, n); + s->img_buffer += n; + return 1; + } else + return 0; +} +#endif + +#if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC) +// nothing +#else +static int stbi__get16be(stbi__context *s) +{ + int z = stbi__get8(s); + return (z << 8) + stbi__get8(s); +} +#endif + +#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC) +// nothing +#else +static stbi__uint32 stbi__get32be(stbi__context *s) +{ + stbi__uint32 z = stbi__get16be(s); + return (z << 16) + stbi__get16be(s); +} +#endif + +#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) +// nothing +#else +static int stbi__get16le(stbi__context *s) +{ + int z = stbi__get8(s); + return z + (stbi__get8(s) << 8); +} +#endif + +#ifndef STBI_NO_BMP +static stbi__uint32 stbi__get32le(stbi__context *s) +{ + stbi__uint32 z = stbi__get16le(s); + z += (stbi__uint32)stbi__get16le(s) << 16; + return z; +} +#endif + +#define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings + +#if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM) +// nothing +#else +////////////////////////////////////////////////////////////////////////////// +// +// generic converter from built-in img_n to req_comp +// individual types do this automatically as much as possible (e.g. jpeg +// does all cases internally since it needs to colorspace convert anyway, +// and it never has alpha, so very few cases ). 
png can automatically +// interleave an alpha=255 channel, but falls back to this for other cases +// +// assume data buffer is malloced, so malloc a new one and free that one +// only failure mode is malloc failing + +static stbi_uc stbi__compute_y(int r, int g, int b) +{ + return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8); +} +#endif + +#if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM) +// nothing +#else +static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y) +{ + int i,j; + unsigned char *good; + + if (req_comp == img_n) return data; + STBI_ASSERT(req_comp >= 1 && req_comp <= 4); + + good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0); + if (good == NULL) { + STBI_FREE(data); + return stbi__errpuc("outofmem", "Out of memory"); + } + + for (j=0; j < (int) y; ++j) { + unsigned char *src = data + j * x * img_n ; + unsigned char *dest = good + j * x * req_comp; + + #define STBI__COMBO(a,b) ((a)*8+(b)) + #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b) + // convert source image with img_n components to one with req_comp components; + // avoid switch per pixel, so use switch per scanline and massive macros + switch (STBI__COMBO(img_n, req_comp)) { + STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255; } break; + STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0]; } break; + STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255; } break; + STBI__CASE(2,1) { dest[0]=src[0]; } break; + STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0]; } break; + STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1]; } break; + STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255; } break; + STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); } break; + STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255; } break; + STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); } break; + STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break; + STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2]; } break; + default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return stbi__errpuc("unsupported", "Unsupported format conversion"); + } + #undef STBI__CASE + } + + STBI_FREE(data); + return good; +} +#endif + +#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) +// nothing +#else +static stbi__uint16 stbi__compute_y_16(int r, int g, int b) +{ + return (stbi__uint16) (((r*77) + (g*150) + (29*b)) >> 8); +} +#endif + +#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) +// nothing +#else +static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y) +{ + int i,j; + stbi__uint16 *good; + + if (req_comp == img_n) return data; + STBI_ASSERT(req_comp >= 1 && req_comp <= 4); + + good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2); + if (good == NULL) { + STBI_FREE(data); + return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory"); + } + + for (j=0; j < (int) y; ++j) { + stbi__uint16 *src = data + j * x * img_n ; + stbi__uint16 *dest = good + j * x * req_comp; + + #define STBI__COMBO(a,b) ((a)*8+(b)) + #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b) + // convert source image with img_n components to one with req_comp components; + // avoid switch per pixel, 
so use switch per scanline and massive macros + switch (STBI__COMBO(img_n, req_comp)) { + STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff; } break; + STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0]; } break; + STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff; } break; + STBI__CASE(2,1) { dest[0]=src[0]; } break; + STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0]; } break; + STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1]; } break; + STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff; } break; + STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); } break; + STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break; + STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); } break; + STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break; + STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2]; } break; + default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return (stbi__uint16*) stbi__errpuc("unsupported", "Unsupported format conversion"); + } + #undef STBI__CASE + } + + STBI_FREE(data); + return good; +} +#endif + +#ifndef STBI_NO_LINEAR +static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp) +{ + int i,k,n; + float *output; + if (!data) return NULL; + output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0); + if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); } + // compute number of non-alpha components + if (comp & 1) n = comp; else n = comp-1; + for (i=0; i < x*y; ++i) { + for (k=0; k < n; ++k) { + output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale); + } + } + if (n < comp) { + for (i=0; i < x*y; ++i) { + output[i*comp + n] = data[i*comp + n]/255.0f; + } + } + STBI_FREE(data); + return output; +} +#endif + +#ifndef STBI_NO_HDR +#define stbi__float2int(x) ((int) (x)) +static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp) +{ + int i,k,n; + stbi_uc *output; + if (!data) return NULL; + output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0); + if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); } + // compute number of non-alpha components + if (comp & 1) n = comp; else n = comp-1; + for (i=0; i < x*y; ++i) { + for (k=0; k < n; ++k) { + float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f; + if (z < 0) z = 0; + if (z > 255) z = 255; + output[i*comp + k] = (stbi_uc) stbi__float2int(z); + } + if (k < comp) { + float z = data[i*comp+k] * 255 + 0.5f; + if (z < 0) z = 0; + if (z > 255) z = 255; + output[i*comp + k] = (stbi_uc) stbi__float2int(z); + } + } + STBI_FREE(data); + return output; +} +#endif + +////////////////////////////////////////////////////////////////////////////// +// +// "baseline" JPEG/JFIF decoder +// +// simple implementation +// - doesn't support delayed output of y-dimension +// - simple interface (only one output format: 8-bit interleaved RGB) +// - doesn't try to recover corrupt jpegs +// - doesn't allow partial loading, loading multiple at once +// - still fast on x86 (copying globals into locals doesn't help x86) +// - allocates lots of intermediate memory (full size of all components) +// - non-interleaved case requires this anyway +// - allows good upsampling (see next) +// high-quality +// - upsampled channels are bilinearly interpolated, even across blocks +// - quality integer IDCT derived from IJG's 
'slow' +// performance +// - fast huffman; reasonable integer IDCT +// - some SIMD kernels for common paths on targets with SSE2/NEON +// - uses a lot of intermediate memory, could cache poorly + +#ifndef STBI_NO_JPEG + +// huffman decoding acceleration +#define FAST_BITS 9 // larger handles more cases; smaller stomps less cache + +typedef struct +{ + stbi_uc fast[1 << FAST_BITS]; + // weirdly, repacking this into AoS is a 10% speed loss, instead of a win + stbi__uint16 code[256]; + stbi_uc values[256]; + stbi_uc size[257]; + unsigned int maxcode[18]; + int delta[17]; // old 'firstsymbol' - old 'firstcode' +} stbi__huffman; + +typedef struct +{ + stbi__context *s; + stbi__huffman huff_dc[4]; + stbi__huffman huff_ac[4]; + stbi__uint16 dequant[4][64]; + stbi__int16 fast_ac[4][1 << FAST_BITS]; + +// sizes for components, interleaved MCUs + int img_h_max, img_v_max; + int img_mcu_x, img_mcu_y; + int img_mcu_w, img_mcu_h; + +// definition of jpeg image component + struct + { + int id; + int h,v; + int tq; + int hd,ha; + int dc_pred; + + int x,y,w2,h2; + stbi_uc *data; + void *raw_data, *raw_coeff; + stbi_uc *linebuf; + short *coeff; // progressive only + int coeff_w, coeff_h; // number of 8x8 coefficient blocks + } img_comp[4]; + + stbi__uint32 code_buffer; // jpeg entropy-coded buffer + int code_bits; // number of valid bits + unsigned char marker; // marker seen while filling entropy buffer + int nomore; // flag if we saw a marker so must stop + + int progressive; + int spec_start; + int spec_end; + int succ_high; + int succ_low; + int eob_run; + int jfif; + int app14_color_transform; // Adobe APP14 tag + int rgb; + + int scan_n, order[4]; + int restart_interval, todo; + +// kernels + void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]); + void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step); + stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs); +} stbi__jpeg; + +static int stbi__build_huffman(stbi__huffman *h, int *count) +{ + int i,j,k=0; + unsigned int code; + // build size list for each symbol (from JPEG spec) + for (i=0; i < 16; ++i) { + for (j=0; j < count[i]; ++j) { + h->size[k++] = (stbi_uc) (i+1); + if(k >= 257) return stbi__err("bad size list","Corrupt JPEG"); + } + } + h->size[k] = 0; + + // compute actual symbols (from jpeg spec) + code = 0; + k = 0; + for(j=1; j <= 16; ++j) { + // compute delta to add to code to compute symbol id + h->delta[j] = k - code; + if (h->size[k] == j) { + while (h->size[k] == j) + h->code[k++] = (stbi__uint16) (code++); + if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG"); + } + // compute largest code + 1 for this size, preshifted as needed later + h->maxcode[j] = code << (16-j); + code <<= 1; + } + h->maxcode[j] = 0xffffffff; + + // build non-spec acceleration table; 255 is flag for not-accelerated + memset(h->fast, 255, 1 << FAST_BITS); + for (i=0; i < k; ++i) { + int s = h->size[i]; + if (s <= FAST_BITS) { + int c = h->code[i] << (FAST_BITS-s); + int m = 1 << (FAST_BITS-s); + for (j=0; j < m; ++j) { + h->fast[c+j] = (stbi_uc) i; + } + } + } + return 1; +} + +// build a table that decodes both magnitude and value of small ACs in +// one go. 
+static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h) +{ + int i; + for (i=0; i < (1 << FAST_BITS); ++i) { + stbi_uc fast = h->fast[i]; + fast_ac[i] = 0; + if (fast < 255) { + int rs = h->values[fast]; + int run = (rs >> 4) & 15; + int magbits = rs & 15; + int len = h->size[fast]; + + if (magbits && len + magbits <= FAST_BITS) { + // magnitude code followed by receive_extend code + int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits); + int m = 1 << (magbits - 1); + if (k < m) k += (~0U << magbits) + 1; + // if the result is small enough, we can fit it in fast_ac table + if (k >= -128 && k <= 127) + fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits)); + } + } + } +} + +static void stbi__grow_buffer_unsafe(stbi__jpeg *j) +{ + do { + unsigned int b = j->nomore ? 0 : stbi__get8(j->s); + if (b == 0xff) { + int c = stbi__get8(j->s); + while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes + if (c != 0) { + j->marker = (unsigned char) c; + j->nomore = 1; + return; + } + } + j->code_buffer |= b << (24 - j->code_bits); + j->code_bits += 8; + } while (j->code_bits <= 24); +} + +// (1 << n) - 1 +static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535}; + +// decode a jpeg huffman value from the bitstream +stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h) +{ + unsigned int temp; + int c,k; + + if (j->code_bits < 16) stbi__grow_buffer_unsafe(j); + + // look at the top FAST_BITS and determine what symbol ID it is, + // if the code is <= FAST_BITS + c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1); + k = h->fast[c]; + if (k < 255) { + int s = h->size[k]; + if (s > j->code_bits) + return -1; + j->code_buffer <<= s; + j->code_bits -= s; + return h->values[k]; + } + + // naive test is to shift the code_buffer down so k bits are + // valid, then test against maxcode. To speed this up, we've + // preshifted maxcode left so that it has (16-k) 0s at the + // end; in other words, regardless of the number of bits, it + // wants to be compared against something shifted to have 16; + // that way we don't need to shift inside the loop. + temp = j->code_buffer >> 16; + for (k=FAST_BITS+1 ; ; ++k) + if (temp < h->maxcode[k]) + break; + if (k == 17) { + // error! code not found + j->code_bits -= 16; + return -1; + } + + if (k > j->code_bits) + return -1; + + // convert the huffman code to the symbol id + c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k]; + if(c < 0 || c >= 256) // symbol id out of bounds! 
+ return -1; + STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]); + + // convert the id to a symbol + j->code_bits -= k; + j->code_buffer <<= k; + return h->values[c]; +} + +// bias[n] = (-1<<n) + 1 +static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767}; + +// combined JPEG 'receive' and JPEG 'extend', since baseline +// always extends everything it receives. +stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n) +{ + unsigned int k; + int sgn; + if (j->code_bits < n) stbi__grow_buffer_unsafe(j); + if (j->code_bits < n) return 0; // ran out of bits from stream, return 0s instead of continuing + + sgn = j->code_buffer >> 31; // sign bit always in MSB; 0 if MSB clear (positive), 1 if MSB set (negative) + k = stbi_lrot(j->code_buffer, n); + j->code_buffer = k & ~stbi__bmask[n]; + k &= stbi__bmask[n]; + j->code_bits -= n; + return k + (stbi__jbias[n] & (sgn - 1)); +} + +// get some unsigned bits +stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n) +{ + unsigned int k; + if (j->code_bits < n) stbi__grow_buffer_unsafe(j); + if (j->code_bits < n) return 0; // ran out of bits from stream, return 0s instead of continuing + k = stbi_lrot(j->code_buffer, n); + j->code_buffer = k & ~stbi__bmask[n]; + k &= stbi__bmask[n]; + j->code_bits -= n; + return k; +} + +stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j) +{ + unsigned int k; + if (j->code_bits < 1) stbi__grow_buffer_unsafe(j); + if (j->code_bits < 1) return 0; // ran out of bits from stream, return 0s instead of continuing + k = j->code_buffer; + j->code_buffer <<= 1; + --j->code_bits; + return k & 0x80000000; +} + +// given a value that's at position X in the zigzag stream, +// where does it appear in the 8x8 matrix coded as row-major? +static const stbi_uc stbi__jpeg_dezigzag[64+15] = +{ + 0, 1, 8, 16, 9, 2, 3, 10, + 17, 24, 32, 25, 18, 11, 4, 5, + 12, 19, 26, 33, 40, 48, 41, 34, + 27, 20, 13, 6, 7, 14, 21, 28, + 35, 42, 49, 56, 57, 50, 43, 36, + 29, 22, 15, 23, 30, 37, 44, 51, + 58, 59, 52, 45, 38, 31, 39, 46, + 53, 60, 61, 54, 47, 55, 62, 63, + // let corrupt input sample past end + 63, 63, 63, 63, 63, 63, 63, 63, + 63, 63, 63, 63, 63, 63, 63 +}; + +// decode one 64-entry block-- +static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant) +{ + int diff,dc,k; + int t; + + if (j->code_bits < 16) stbi__grow_buffer_unsafe(j); + t = stbi__jpeg_huff_decode(j, hdc); + if (t < 0 || t > 15) return stbi__err("bad huffman code","Corrupt JPEG"); + + // 0 all the ac values now so we can do it 32-bits at a time + memset(data,0,64*sizeof(data[0])); + + diff = t ?
stbi__extend_receive(j, t) : 0; + if (!stbi__addints_valid(j->img_comp[b].dc_pred, diff)) return stbi__err("bad delta","Corrupt JPEG"); + dc = j->img_comp[b].dc_pred + diff; + j->img_comp[b].dc_pred = dc; + if (!stbi__mul2shorts_valid(dc, dequant[0])) return stbi__err("can't merge dc and ac", "Corrupt JPEG"); + data[0] = (short) (dc * dequant[0]); + + // decode AC components, see JPEG spec + k = 1; + do { + unsigned int zig; + int c,r,s; + if (j->code_bits < 16) stbi__grow_buffer_unsafe(j); + c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1); + r = fac[c]; + if (r) { // fast-AC path + k += (r >> 4) & 15; // run + s = r & 15; // combined length + if (s > j->code_bits) return stbi__err("bad huffman code", "Combined length longer than code bits available"); + j->code_buffer <<= s; + j->code_bits -= s; + // decode into unzigzag'd location + zig = stbi__jpeg_dezigzag[k++]; + data[zig] = (short) ((r >> 8) * dequant[zig]); + } else { + int rs = stbi__jpeg_huff_decode(j, hac); + if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG"); + s = rs & 15; + r = rs >> 4; + if (s == 0) { + if (rs != 0xf0) break; // end block + k += 16; + } else { + k += r; + // decode into unzigzag'd location + zig = stbi__jpeg_dezigzag[k++]; + data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]); + } + } + } while (k < 64); + return 1; +} + +static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b) +{ + int diff,dc; + int t; + if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG"); + + if (j->code_bits < 16) stbi__grow_buffer_unsafe(j); + + if (j->succ_high == 0) { + // first scan for DC coefficient, must be first + memset(data,0,64*sizeof(data[0])); // 0 all the ac values now + t = stbi__jpeg_huff_decode(j, hdc); + if (t < 0 || t > 15) return stbi__err("can't merge dc and ac", "Corrupt JPEG"); + diff = t ? 
stbi__extend_receive(j, t) : 0; + + if (!stbi__addints_valid(j->img_comp[b].dc_pred, diff)) return stbi__err("bad delta", "Corrupt JPEG"); + dc = j->img_comp[b].dc_pred + diff; + j->img_comp[b].dc_pred = dc; + if (!stbi__mul2shorts_valid(dc, 1 << j->succ_low)) return stbi__err("can't merge dc and ac", "Corrupt JPEG"); + data[0] = (short) (dc * (1 << j->succ_low)); + } else { + // refinement scan for DC coefficient + if (stbi__jpeg_get_bit(j)) + data[0] += (short) (1 << j->succ_low); + } + return 1; +} + +// @OPTIMIZE: store non-zigzagged during the decode passes, +// and only de-zigzag when dequantizing +static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac) +{ + int k; + if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG"); + + if (j->succ_high == 0) { + int shift = j->succ_low; + + if (j->eob_run) { + --j->eob_run; + return 1; + } + + k = j->spec_start; + do { + unsigned int zig; + int c,r,s; + if (j->code_bits < 16) stbi__grow_buffer_unsafe(j); + c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1); + r = fac[c]; + if (r) { // fast-AC path + k += (r >> 4) & 15; // run + s = r & 15; // combined length + if (s > j->code_bits) return stbi__err("bad huffman code", "Combined length longer than code bits available"); + j->code_buffer <<= s; + j->code_bits -= s; + zig = stbi__jpeg_dezigzag[k++]; + data[zig] = (short) ((r >> 8) * (1 << shift)); + } else { + int rs = stbi__jpeg_huff_decode(j, hac); + if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG"); + s = rs & 15; + r = rs >> 4; + if (s == 0) { + if (r < 15) { + j->eob_run = (1 << r); + if (r) + j->eob_run += stbi__jpeg_get_bits(j, r); + --j->eob_run; + break; + } + k += 16; + } else { + k += r; + zig = stbi__jpeg_dezigzag[k++]; + data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift)); + } + } + } while (k <= j->spec_end); + } else { + // refinement scan for these AC coefficients + + short bit = (short) (1 << j->succ_low); + + if (j->eob_run) { + --j->eob_run; + for (k = j->spec_start; k <= j->spec_end; ++k) { + short *p = &data[stbi__jpeg_dezigzag[k]]; + if (*p != 0) + if (stbi__jpeg_get_bit(j)) + if ((*p & bit)==0) { + if (*p > 0) + *p += bit; + else + *p -= bit; + } + } + } else { + k = j->spec_start; + do { + int r,s; + int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh + if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG"); + s = rs & 15; + r = rs >> 4; + if (s == 0) { + if (r < 15) { + j->eob_run = (1 << r) - 1; + if (r) + j->eob_run += stbi__jpeg_get_bits(j, r); + r = 64; // force end of block + } else { + // r=15 s=0 should write 16 0s, so we just do + // a run of 15 0s and then write s (which is 0), + // so we don't have to do anything special here + } + } else { + if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG"); + // sign bit + if (stbi__jpeg_get_bit(j)) + s = bit; + else + s = -bit; + } + + // advance by r + while (k <= j->spec_end) { + short *p = &data[stbi__jpeg_dezigzag[k++]]; + if (*p != 0) { + if (stbi__jpeg_get_bit(j)) + if ((*p & bit)==0) { + if (*p > 0) + *p += bit; + else + *p -= bit; + } + } else { + if (r == 0) { + *p = (short) s; + break; + } + --r; + } + } + } while (k <= j->spec_end); + } + } + return 1; +} + +// take a -128..127 value and stbi__clamp it and convert to 0..255 +stbi_inline static stbi_uc stbi__clamp(int x) +{ + // trick to use a single test to catch both cases + if ((unsigned int) x > 255) { 
+ if (x < 0) return 0; + if (x > 255) return 255; + } + return (stbi_uc) x; +} + +#define stbi__f2f(x) ((int) (((x) * 4096 + 0.5))) +#define stbi__fsh(x) ((x) * 4096) + +// derived from jidctint -- DCT_ISLOW +#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \ + int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \ + p2 = s2; \ + p3 = s6; \ + p1 = (p2+p3) * stbi__f2f(0.5411961f); \ + t2 = p1 + p3*stbi__f2f(-1.847759065f); \ + t3 = p1 + p2*stbi__f2f( 0.765366865f); \ + p2 = s0; \ + p3 = s4; \ + t0 = stbi__fsh(p2+p3); \ + t1 = stbi__fsh(p2-p3); \ + x0 = t0+t3; \ + x3 = t0-t3; \ + x1 = t1+t2; \ + x2 = t1-t2; \ + t0 = s7; \ + t1 = s5; \ + t2 = s3; \ + t3 = s1; \ + p3 = t0+t2; \ + p4 = t1+t3; \ + p1 = t0+t3; \ + p2 = t1+t2; \ + p5 = (p3+p4)*stbi__f2f( 1.175875602f); \ + t0 = t0*stbi__f2f( 0.298631336f); \ + t1 = t1*stbi__f2f( 2.053119869f); \ + t2 = t2*stbi__f2f( 3.072711026f); \ + t3 = t3*stbi__f2f( 1.501321110f); \ + p1 = p5 + p1*stbi__f2f(-0.899976223f); \ + p2 = p5 + p2*stbi__f2f(-2.562915447f); \ + p3 = p3*stbi__f2f(-1.961570560f); \ + p4 = p4*stbi__f2f(-0.390180644f); \ + t3 += p1+p4; \ + t2 += p2+p3; \ + t1 += p2+p4; \ + t0 += p1+p3; + +static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64]) +{ + int i,val[64],*v=val; + stbi_uc *o; + short *d = data; + + // columns + for (i=0; i < 8; ++i,++d, ++v) { + // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing + if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0 + && d[40]==0 && d[48]==0 && d[56]==0) { + // no shortcut 0 seconds + // (1|2|3|4|5|6|7)==0 0 seconds + // all separate -0.047 seconds + // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds + int dcterm = d[0]*4; + v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm; + } else { + STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56]) + // constants scaled things up by 1<<12; let's bring them back + // down, but keep 2 extra bits of precision + x0 += 512; x1 += 512; x2 += 512; x3 += 512; + v[ 0] = (x0+t3) >> 10; + v[56] = (x0-t3) >> 10; + v[ 8] = (x1+t2) >> 10; + v[48] = (x1-t2) >> 10; + v[16] = (x2+t1) >> 10; + v[40] = (x2-t1) >> 10; + v[24] = (x3+t0) >> 10; + v[32] = (x3-t0) >> 10; + } + } + + for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) { + // no fast case since the first 1D IDCT spread components out + STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7]) + // constants scaled things up by 1<<12, plus we had 1<<2 from first + // loop, plus horizontal and vertical each scale by sqrt(8) so together + // we've got an extra 1<<3, so 1<<17 total we need to remove. + // so we want to round that, which means adding 0.5 * 1<<17, + // aka 65536. Also, we'll end up with -128 to 127 that we want + // to encode as 0..255 by adding 128, so we'll add that before the shift + x0 += 65536 + (128<<17); + x1 += 65536 + (128<<17); + x2 += 65536 + (128<<17); + x3 += 65536 + (128<<17); + // tried computing the shifts into temps, or'ing the temps to see + // if any were out of range, but that was slower + o[0] = stbi__clamp((x0+t3) >> 17); + o[7] = stbi__clamp((x0-t3) >> 17); + o[1] = stbi__clamp((x1+t2) >> 17); + o[6] = stbi__clamp((x1-t2) >> 17); + o[2] = stbi__clamp((x2+t1) >> 17); + o[5] = stbi__clamp((x2-t1) >> 17); + o[3] = stbi__clamp((x3+t0) >> 17); + o[4] = stbi__clamp((x3-t0) >> 17); + } +} + +#ifdef STBI_SSE2 +// sse2 integer IDCT. not the fastest possible implementation but it +// produces bit-identical results to the generic C version so it's +// fully "transparent". 
+static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64]) +{ + // This is constructed to match our regular (generic) integer IDCT exactly. + __m128i row0, row1, row2, row3, row4, row5, row6, row7; + __m128i tmp; + + // dot product constant: even elems=x, odd elems=y + #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y)) + + // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit) + // out(1) = c1[even]*x + c1[odd]*y + #define dct_rot(out0,out1, x,y,c0,c1) \ + __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \ + __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \ + __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \ + __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \ + __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \ + __m128i out1##_h = _mm_madd_epi16(c0##hi, c1) + + // out = in << 12 (in 16-bit, out 32-bit) + #define dct_widen(out, in) \ + __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \ + __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4) + + // wide add + #define dct_wadd(out, a, b) \ + __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \ + __m128i out##_h = _mm_add_epi32(a##_h, b##_h) + + // wide sub + #define dct_wsub(out, a, b) \ + __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \ + __m128i out##_h = _mm_sub_epi32(a##_h, b##_h) + + // butterfly a/b, add bias, then shift by "s" and pack + #define dct_bfly32o(out0, out1, a,b,bias,s) \ + { \ + __m128i abiased_l = _mm_add_epi32(a##_l, bias); \ + __m128i abiased_h = _mm_add_epi32(a##_h, bias); \ + dct_wadd(sum, abiased, b); \ + dct_wsub(dif, abiased, b); \ + out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \ + out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \ + } + + // 8-bit interleave step (for transposes) + #define dct_interleave8(a, b) \ + tmp = a; \ + a = _mm_unpacklo_epi8(a, b); \ + b = _mm_unpackhi_epi8(tmp, b) + + // 16-bit interleave step (for transposes) + #define dct_interleave16(a, b) \ + tmp = a; \ + a = _mm_unpacklo_epi16(a, b); \ + b = _mm_unpackhi_epi16(tmp, b) + + #define dct_pass(bias,shift) \ + { \ + /* even part */ \ + dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \ + __m128i sum04 = _mm_add_epi16(row0, row4); \ + __m128i dif04 = _mm_sub_epi16(row0, row4); \ + dct_widen(t0e, sum04); \ + dct_widen(t1e, dif04); \ + dct_wadd(x0, t0e, t3e); \ + dct_wsub(x3, t0e, t3e); \ + dct_wadd(x1, t1e, t2e); \ + dct_wsub(x2, t1e, t2e); \ + /* odd part */ \ + dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \ + dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \ + __m128i sum17 = _mm_add_epi16(row1, row7); \ + __m128i sum35 = _mm_add_epi16(row3, row5); \ + dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \ + dct_wadd(x4, y0o, y4o); \ + dct_wadd(x5, y1o, y5o); \ + dct_wadd(x6, y2o, y5o); \ + dct_wadd(x7, y3o, y4o); \ + dct_bfly32o(row0,row7, x0,x7,bias,shift); \ + dct_bfly32o(row1,row6, x1,x6,bias,shift); \ + dct_bfly32o(row2,row5, x2,x5,bias,shift); \ + dct_bfly32o(row3,row4, x3,x4,bias,shift); \ + } + + __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f)); + __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f)); + __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f)); + __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f)); + __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), 
stbi__f2f(-1.961570560f)); + __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f)); + __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f)); + __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f)); + + // rounding biases in column/row passes, see stbi__idct_block for explanation. + __m128i bias_0 = _mm_set1_epi32(512); + __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17)); + + // load + row0 = _mm_load_si128((const __m128i *) (data + 0*8)); + row1 = _mm_load_si128((const __m128i *) (data + 1*8)); + row2 = _mm_load_si128((const __m128i *) (data + 2*8)); + row3 = _mm_load_si128((const __m128i *) (data + 3*8)); + row4 = _mm_load_si128((const __m128i *) (data + 4*8)); + row5 = _mm_load_si128((const __m128i *) (data + 5*8)); + row6 = _mm_load_si128((const __m128i *) (data + 6*8)); + row7 = _mm_load_si128((const __m128i *) (data + 7*8)); + + // column pass + dct_pass(bias_0, 10); + + { + // 16bit 8x8 transpose pass 1 + dct_interleave16(row0, row4); + dct_interleave16(row1, row5); + dct_interleave16(row2, row6); + dct_interleave16(row3, row7); + + // transpose pass 2 + dct_interleave16(row0, row2); + dct_interleave16(row1, row3); + dct_interleave16(row4, row6); + dct_interleave16(row5, row7); + + // transpose pass 3 + dct_interleave16(row0, row1); + dct_interleave16(row2, row3); + dct_interleave16(row4, row5); + dct_interleave16(row6, row7); + } + + // row pass + dct_pass(bias_1, 17); + + { + // pack + __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7 + __m128i p1 = _mm_packus_epi16(row2, row3); + __m128i p2 = _mm_packus_epi16(row4, row5); + __m128i p3 = _mm_packus_epi16(row6, row7); + + // 8bit 8x8 transpose pass 1 + dct_interleave8(p0, p2); // a0e0a1e1... + dct_interleave8(p1, p3); // c0g0c1g1... + + // transpose pass 2 + dct_interleave8(p0, p1); // a0c0e0g0... + dct_interleave8(p2, p3); // b0d0f0h0... + + // transpose pass 3 + dct_interleave8(p0, p2); // a0b0c0d0... + dct_interleave8(p1, p3); // a4b4c4d4... + + // store + _mm_storel_epi64((__m128i *) out, p0); out += out_stride; + _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride; + _mm_storel_epi64((__m128i *) out, p2); out += out_stride; + _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride; + _mm_storel_epi64((__m128i *) out, p1); out += out_stride; + _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride; + _mm_storel_epi64((__m128i *) out, p3); out += out_stride; + _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e)); + } + +#undef dct_const +#undef dct_rot +#undef dct_widen +#undef dct_wadd +#undef dct_wsub +#undef dct_bfly32o +#undef dct_interleave8 +#undef dct_interleave16 +#undef dct_pass +} + +#endif // STBI_SSE2 + +#ifdef STBI_NEON + +// NEON integer IDCT. should produce bit-identical +// results to the generic C version. 
+static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64]) +{ + int16x8_t row0, row1, row2, row3, row4, row5, row6, row7; + + int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f)); + int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f)); + int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f)); + int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f)); + int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f)); + int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f)); + int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f)); + int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f)); + int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f)); + int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f)); + int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f)); + int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f)); + +#define dct_long_mul(out, inq, coeff) \ + int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \ + int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff) + +#define dct_long_mac(out, acc, inq, coeff) \ + int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \ + int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff) + +#define dct_widen(out, inq) \ + int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \ + int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12) + +// wide add +#define dct_wadd(out, a, b) \ + int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \ + int32x4_t out##_h = vaddq_s32(a##_h, b##_h) + +// wide sub +#define dct_wsub(out, a, b) \ + int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \ + int32x4_t out##_h = vsubq_s32(a##_h, b##_h) + +// butterfly a/b, then shift using "shiftop" by "s" and pack +#define dct_bfly32o(out0,out1, a,b,shiftop,s) \ + { \ + dct_wadd(sum, a, b); \ + dct_wsub(dif, a, b); \ + out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \ + out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \ + } + +#define dct_pass(shiftop, shift) \ + { \ + /* even part */ \ + int16x8_t sum26 = vaddq_s16(row2, row6); \ + dct_long_mul(p1e, sum26, rot0_0); \ + dct_long_mac(t2e, p1e, row6, rot0_1); \ + dct_long_mac(t3e, p1e, row2, rot0_2); \ + int16x8_t sum04 = vaddq_s16(row0, row4); \ + int16x8_t dif04 = vsubq_s16(row0, row4); \ + dct_widen(t0e, sum04); \ + dct_widen(t1e, dif04); \ + dct_wadd(x0, t0e, t3e); \ + dct_wsub(x3, t0e, t3e); \ + dct_wadd(x1, t1e, t2e); \ + dct_wsub(x2, t1e, t2e); \ + /* odd part */ \ + int16x8_t sum15 = vaddq_s16(row1, row5); \ + int16x8_t sum17 = vaddq_s16(row1, row7); \ + int16x8_t sum35 = vaddq_s16(row3, row5); \ + int16x8_t sum37 = vaddq_s16(row3, row7); \ + int16x8_t sumodd = vaddq_s16(sum17, sum35); \ + dct_long_mul(p5o, sumodd, rot1_0); \ + dct_long_mac(p1o, p5o, sum17, rot1_1); \ + dct_long_mac(p2o, p5o, sum35, rot1_2); \ + dct_long_mul(p3o, sum37, rot2_0); \ + dct_long_mul(p4o, sum15, rot2_1); \ + dct_wadd(sump13o, p1o, p3o); \ + dct_wadd(sump24o, p2o, p4o); \ + dct_wadd(sump23o, p2o, p3o); \ + dct_wadd(sump14o, p1o, p4o); \ + dct_long_mac(x4, sump13o, row7, rot3_0); \ + dct_long_mac(x5, sump24o, row5, rot3_1); \ + dct_long_mac(x6, sump23o, row3, rot3_2); \ + dct_long_mac(x7, sump14o, row1, rot3_3); \ + dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \ + dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \ + dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \ + dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \ + } + + // load + row0 = vld1q_s16(data + 0*8); + row1 = vld1q_s16(data + 1*8); + row2 = vld1q_s16(data + 2*8); + row3 = vld1q_s16(data + 3*8); + row4 = vld1q_s16(data + 4*8); + row5 = 
vld1q_s16(data + 5*8); + row6 = vld1q_s16(data + 6*8); + row7 = vld1q_s16(data + 7*8); + + // add DC bias + row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0)); + + // column pass + dct_pass(vrshrn_n_s32, 10); + + // 16bit 8x8 transpose + { +// these three map to a single VTRN.16, VTRN.32, and VSWP, respectively. +// whether compilers actually get this is another story, sadly. +#define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; } +#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); } +#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); } + + // pass 1 + dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6 + dct_trn16(row2, row3); + dct_trn16(row4, row5); + dct_trn16(row6, row7); + + // pass 2 + dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4 + dct_trn32(row1, row3); + dct_trn32(row4, row6); + dct_trn32(row5, row7); + + // pass 3 + dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0 + dct_trn64(row1, row5); + dct_trn64(row2, row6); + dct_trn64(row3, row7); + +#undef dct_trn16 +#undef dct_trn32 +#undef dct_trn64 + } + + // row pass + // vrshrn_n_s32 only supports shifts up to 16, we need + // 17. so do a non-rounding shift of 16 first then follow + // up with a rounding shift by 1. + dct_pass(vshrn_n_s32, 16); + + { + // pack and round + uint8x8_t p0 = vqrshrun_n_s16(row0, 1); + uint8x8_t p1 = vqrshrun_n_s16(row1, 1); + uint8x8_t p2 = vqrshrun_n_s16(row2, 1); + uint8x8_t p3 = vqrshrun_n_s16(row3, 1); + uint8x8_t p4 = vqrshrun_n_s16(row4, 1); + uint8x8_t p5 = vqrshrun_n_s16(row5, 1); + uint8x8_t p6 = vqrshrun_n_s16(row6, 1); + uint8x8_t p7 = vqrshrun_n_s16(row7, 1); + + // again, these can translate into one instruction, but often don't. +#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; } +#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); } +#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); } + + // sadly can't use interleaved stores here since we only write + // 8 bytes to each scan line! + + // 8x8 8-bit transpose pass 1 + dct_trn8_8(p0, p1); + dct_trn8_8(p2, p3); + dct_trn8_8(p4, p5); + dct_trn8_8(p6, p7); + + // pass 2 + dct_trn8_16(p0, p2); + dct_trn8_16(p1, p3); + dct_trn8_16(p4, p6); + dct_trn8_16(p5, p7); + + // pass 3 + dct_trn8_32(p0, p4); + dct_trn8_32(p1, p5); + dct_trn8_32(p2, p6); + dct_trn8_32(p3, p7); + + // store + vst1_u8(out, p0); out += out_stride; + vst1_u8(out, p1); out += out_stride; + vst1_u8(out, p2); out += out_stride; + vst1_u8(out, p3); out += out_stride; + vst1_u8(out, p4); out += out_stride; + vst1_u8(out, p5); out += out_stride; + vst1_u8(out, p6); out += out_stride; + vst1_u8(out, p7); + +#undef dct_trn8_8 +#undef dct_trn8_16 +#undef dct_trn8_32 + } + +#undef dct_long_mul +#undef dct_long_mac +#undef dct_widen +#undef dct_wadd +#undef dct_wsub +#undef dct_bfly32o +#undef dct_pass +} + +#endif // STBI_NEON + +#define STBI__MARKER_none 0xff +// if there's a pending marker from the entropy stream, return that +// otherwise, fetch from the stream and get a marker. 
if there's no +// marker, return 0xff, which is never a valid marker value +static stbi_uc stbi__get_marker(stbi__jpeg *j) +{ + stbi_uc x; + if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; } + x = stbi__get8(j->s); + if (x != 0xff) return STBI__MARKER_none; + while (x == 0xff) + x = stbi__get8(j->s); // consume repeated 0xff fill bytes + return x; +} + +// in each scan, we'll have scan_n components, and the order +// of the components is specified by order[] +#define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7) + +// after a restart interval, stbi__jpeg_reset the entropy decoder and +// the dc prediction +static void stbi__jpeg_reset(stbi__jpeg *j) +{ + j->code_bits = 0; + j->code_buffer = 0; + j->nomore = 0; + j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0; + j->marker = STBI__MARKER_none; + j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff; + j->eob_run = 0; + // no more than 1<<31 MCUs if no restart_interal? that's plenty safe, + // since we don't even allow 1<<30 pixels +} + +static int stbi__parse_entropy_coded_data(stbi__jpeg *z) +{ + stbi__jpeg_reset(z); + if (!z->progressive) { + if (z->scan_n == 1) { + int i,j; + STBI_SIMD_ALIGN(short, data[64]); + int n = z->order[0]; + // non-interleaved data, we just need to process one block at a time, + // in trivial scanline order + // number of blocks to do just depends on how many actual "pixels" this + // component has, independent of interleaved MCU blocking and such + int w = (z->img_comp[n].x+7) >> 3; + int h = (z->img_comp[n].y+7) >> 3; + for (j=0; j < h; ++j) { + for (i=0; i < w; ++i) { + int ha = z->img_comp[n].ha; + if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0; + z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data); + // every data block is an MCU, so countdown the restart interval + if (--z->todo <= 0) { + if (z->code_bits < 24) stbi__grow_buffer_unsafe(z); + // if it's NOT a restart, then just bail, so we get corrupt data + // rather than no data + if (!STBI__RESTART(z->marker)) return 1; + stbi__jpeg_reset(z); + } + } + } + return 1; + } else { // interleaved + int i,j,k,x,y; + STBI_SIMD_ALIGN(short, data[64]); + for (j=0; j < z->img_mcu_y; ++j) { + for (i=0; i < z->img_mcu_x; ++i) { + // scan an interleaved mcu... 
process scan_n components in order + for (k=0; k < z->scan_n; ++k) { + int n = z->order[k]; + // scan out an mcu's worth of this component; that's just determined + // by the basic H and V specified for the component + for (y=0; y < z->img_comp[n].v; ++y) { + for (x=0; x < z->img_comp[n].h; ++x) { + int x2 = (i*z->img_comp[n].h + x)*8; + int y2 = (j*z->img_comp[n].v + y)*8; + int ha = z->img_comp[n].ha; + if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0; + z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data); + } + } + } + // after all interleaved components, that's an interleaved MCU, + // so now count down the restart interval + if (--z->todo <= 0) { + if (z->code_bits < 24) stbi__grow_buffer_unsafe(z); + if (!STBI__RESTART(z->marker)) return 1; + stbi__jpeg_reset(z); + } + } + } + return 1; + } + } else { + if (z->scan_n == 1) { + int i,j; + int n = z->order[0]; + // non-interleaved data, we just need to process one block at a time, + // in trivial scanline order + // number of blocks to do just depends on how many actual "pixels" this + // component has, independent of interleaved MCU blocking and such + int w = (z->img_comp[n].x+7) >> 3; + int h = (z->img_comp[n].y+7) >> 3; + for (j=0; j < h; ++j) { + for (i=0; i < w; ++i) { + short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w); + if (z->spec_start == 0) { + if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n)) + return 0; + } else { + int ha = z->img_comp[n].ha; + if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha])) + return 0; + } + // every data block is an MCU, so countdown the restart interval + if (--z->todo <= 0) { + if (z->code_bits < 24) stbi__grow_buffer_unsafe(z); + if (!STBI__RESTART(z->marker)) return 1; + stbi__jpeg_reset(z); + } + } + } + return 1; + } else { // interleaved + int i,j,k,x,y; + for (j=0; j < z->img_mcu_y; ++j) { + for (i=0; i < z->img_mcu_x; ++i) { + // scan an interleaved mcu... 
process scan_n components in order + for (k=0; k < z->scan_n; ++k) { + int n = z->order[k]; + // scan out an mcu's worth of this component; that's just determined + // by the basic H and V specified for the component + for (y=0; y < z->img_comp[n].v; ++y) { + for (x=0; x < z->img_comp[n].h; ++x) { + int x2 = (i*z->img_comp[n].h + x); + int y2 = (j*z->img_comp[n].v + y); + short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w); + if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n)) + return 0; + } + } + } + // after all interleaved components, that's an interleaved MCU, + // so now count down the restart interval + if (--z->todo <= 0) { + if (z->code_bits < 24) stbi__grow_buffer_unsafe(z); + if (!STBI__RESTART(z->marker)) return 1; + stbi__jpeg_reset(z); + } + } + } + return 1; + } + } +} + +static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant) +{ + int i; + for (i=0; i < 64; ++i) + data[i] *= dequant[i]; +} + +static void stbi__jpeg_finish(stbi__jpeg *z) +{ + if (z->progressive) { + // dequantize and idct the data + int i,j,n; + for (n=0; n < z->s->img_n; ++n) { + int w = (z->img_comp[n].x+7) >> 3; + int h = (z->img_comp[n].y+7) >> 3; + for (j=0; j < h; ++j) { + for (i=0; i < w; ++i) { + short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w); + stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]); + z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data); + } + } + } + } +} + +static int stbi__process_marker(stbi__jpeg *z, int m) +{ + int L; + switch (m) { + case STBI__MARKER_none: // no marker found + return stbi__err("expected marker","Corrupt JPEG"); + + case 0xDD: // DRI - specify restart interval + if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG"); + z->restart_interval = stbi__get16be(z->s); + return 1; + + case 0xDB: // DQT - define quantization table + L = stbi__get16be(z->s)-2; + while (L > 0) { + int q = stbi__get8(z->s); + int p = q >> 4, sixteen = (p != 0); + int t = q & 15,i; + if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG"); + if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG"); + + for (i=0; i < 64; ++i) + z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s)); + L -= (sixteen ? 129 : 65); + } + return L==0; + + case 0xC4: // DHT - define huffman table + L = stbi__get16be(z->s)-2; + while (L > 0) { + stbi_uc *v; + int sizes[16],i,n=0; + int q = stbi__get8(z->s); + int tc = q >> 4; + int th = q & 15; + if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG"); + for (i=0; i < 16; ++i) { + sizes[i] = stbi__get8(z->s); + n += sizes[i]; + } + if(n > 256) return stbi__err("bad DHT header","Corrupt JPEG"); // Loop over i < n would write past end of values! 
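+ // 17 bytes have been consumed for this table so far: 1 for the class/id byte and 16 for the per-length code counts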
+ L -= 17; + if (tc == 0) { + if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0; + v = z->huff_dc[th].values; + } else { + if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0; + v = z->huff_ac[th].values; + } + for (i=0; i < n; ++i) + v[i] = stbi__get8(z->s); + if (tc != 0) + stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th); + L -= n; + } + return L==0; + } + + // check for comment block or APP blocks + if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) { + L = stbi__get16be(z->s); + if (L < 2) { + if (m == 0xFE) + return stbi__err("bad COM len","Corrupt JPEG"); + else + return stbi__err("bad APP len","Corrupt JPEG"); + } + L -= 2; + + if (m == 0xE0 && L >= 5) { // JFIF APP0 segment + static const unsigned char tag[5] = {'J','F','I','F','\0'}; + int ok = 1; + int i; + for (i=0; i < 5; ++i) + if (stbi__get8(z->s) != tag[i]) + ok = 0; + L -= 5; + if (ok) + z->jfif = 1; + } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment + static const unsigned char tag[6] = {'A','d','o','b','e','\0'}; + int ok = 1; + int i; + for (i=0; i < 6; ++i) + if (stbi__get8(z->s) != tag[i]) + ok = 0; + L -= 6; + if (ok) { + stbi__get8(z->s); // version + stbi__get16be(z->s); // flags0 + stbi__get16be(z->s); // flags1 + z->app14_color_transform = stbi__get8(z->s); // color transform + L -= 6; + } + } + + stbi__skip(z->s, L); + return 1; + } + + return stbi__err("unknown marker","Corrupt JPEG"); +} + +// after we see SOS +static int stbi__process_scan_header(stbi__jpeg *z) +{ + int i; + int Ls = stbi__get16be(z->s); + z->scan_n = stbi__get8(z->s); + if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG"); + if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG"); + for (i=0; i < z->scan_n; ++i) { + int id = stbi__get8(z->s), which; + int q = stbi__get8(z->s); + for (which = 0; which < z->s->img_n; ++which) + if (z->img_comp[which].id == id) + break; + if (which == z->s->img_n) return 0; // no match + z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG"); + z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG"); + z->order[i] = which; + } + + { + int aa; + z->spec_start = stbi__get8(z->s); + z->spec_end = stbi__get8(z->s); // should be 63, but might be 0 + aa = stbi__get8(z->s); + z->succ_high = (aa >> 4); + z->succ_low = (aa & 15); + if (z->progressive) { + if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13) + return stbi__err("bad SOS", "Corrupt JPEG"); + } else { + if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG"); + if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG"); + z->spec_end = 63; + } + } + + return 1; +} + +static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why) +{ + int i; + for (i=0; i < ncomp; ++i) { + if (z->img_comp[i].raw_data) { + STBI_FREE(z->img_comp[i].raw_data); + z->img_comp[i].raw_data = NULL; + z->img_comp[i].data = NULL; + } + if (z->img_comp[i].raw_coeff) { + STBI_FREE(z->img_comp[i].raw_coeff); + z->img_comp[i].raw_coeff = 0; + z->img_comp[i].coeff = 0; + } + if (z->img_comp[i].linebuf) { + STBI_FREE(z->img_comp[i].linebuf); + z->img_comp[i].linebuf = NULL; + } + } + return why; +} + +static int stbi__process_frame_header(stbi__jpeg *z, int scan) +{ + stbi__context *s = z->s; + int Lf,p,i,q, h_max=1,v_max=1,c; + Lf = stbi__get16be(s); if (Lf < 11) 
return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG + p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline + s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG + s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires + if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)"); + if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)"); + c = stbi__get8(s); + if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG"); + s->img_n = c; + for (i=0; i < c; ++i) { + z->img_comp[i].data = NULL; + z->img_comp[i].linebuf = NULL; + } + + if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG"); + + z->rgb = 0; + for (i=0; i < s->img_n; ++i) { + static const unsigned char rgb[3] = { 'R', 'G', 'B' }; + z->img_comp[i].id = stbi__get8(s); + if (s->img_n == 3 && z->img_comp[i].id == rgb[i]) + ++z->rgb; + q = stbi__get8(s); + z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG"); + z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG"); + z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG"); + } + + if (scan != STBI__SCAN_load) return 1; + + if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode"); + + for (i=0; i < s->img_n; ++i) { + if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h; + if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v; + } + + // check that plane subsampling factors are integer ratios; our resamplers can't deal with fractional ratios + // and I've never seen a non-corrupted JPEG file actually use them + for (i=0; i < s->img_n; ++i) { + if (h_max % z->img_comp[i].h != 0) return stbi__err("bad H","Corrupt JPEG"); + if (v_max % z->img_comp[i].v != 0) return stbi__err("bad V","Corrupt JPEG"); + } + + // compute interleaved mcu info + z->img_h_max = h_max; + z->img_v_max = v_max; + z->img_mcu_w = h_max * 8; + z->img_mcu_h = v_max * 8; + // these sizes can't be more than 17 bits + z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w; + z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h; + + for (i=0; i < s->img_n; ++i) { + // number of effective pixels (e.g. for non-interleaved MCU) + z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max; + z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max; + // to simplify generation, we'll allocate enough memory to decode + // the bogus oversized data from using interleaved MCUs and their + // big blocks (e.g. 
a 16x16 iMCU on an image of width 33); we won't + // discard the extra data until colorspace conversion + // + // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier) + // so these muls can't overflow with 32-bit ints (which we require) + z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8; + z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8; + z->img_comp[i].coeff = 0; + z->img_comp[i].raw_coeff = 0; + z->img_comp[i].linebuf = NULL; + z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15); + if (z->img_comp[i].raw_data == NULL) + return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory")); + // align blocks for idct using mmx/sse + z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15); + if (z->progressive) { + // w2, h2 are multiples of 8 (see above) + z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8; + z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8; + z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15); + if (z->img_comp[i].raw_coeff == NULL) + return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory")); + z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15); + } + } + + return 1; +} + +// use comparisons since in some cases we handle more than one case (e.g. SOF) +#define stbi__DNL(x) ((x) == 0xdc) +#define stbi__SOI(x) ((x) == 0xd8) +#define stbi__EOI(x) ((x) == 0xd9) +#define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2) +#define stbi__SOS(x) ((x) == 0xda) + +#define stbi__SOF_progressive(x) ((x) == 0xc2) + +static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan) +{ + int m; + z->jfif = 0; + z->app14_color_transform = -1; // valid values are 0,1,2 + z->marker = STBI__MARKER_none; // initialize cached marker to empty + m = stbi__get_marker(z); + if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG"); + if (scan == STBI__SCAN_type) return 1; + m = stbi__get_marker(z); + while (!stbi__SOF(m)) { + if (!stbi__process_marker(z,m)) return 0; + m = stbi__get_marker(z); + while (m == STBI__MARKER_none) { + // some files have extra padding after their blocks, so ok, we'll scan + if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG"); + m = stbi__get_marker(z); + } + } + z->progressive = stbi__SOF_progressive(m); + if (!stbi__process_frame_header(z, scan)) return 0; + return 1; +} + +static int stbi__skip_jpeg_junk_at_end(stbi__jpeg *j) +{ + // some JPEGs have junk at end, skip over it but if we find what looks + // like a valid marker, resume there + while (!stbi__at_eof(j->s)) { + int x = stbi__get8(j->s); + while (x == 255) { // might be a marker + if (stbi__at_eof(j->s)) return STBI__MARKER_none; + x = stbi__get8(j->s); + if (x != 0x00 && x != 0xff) { + // not a stuffed zero or lead-in to another marker, looks + // like an actual marker, return it + return x; + } + // stuffed zero has x=0 now which ends the loop, meaning we go + // back to regular scan loop. + // repeated 0xff keeps trying to read the next byte of the marker. 
+ } + } + return STBI__MARKER_none; +} + +// decode image to YCbCr format +static int stbi__decode_jpeg_image(stbi__jpeg *j) +{ + int m; + for (m = 0; m < 4; m++) { + j->img_comp[m].raw_data = NULL; + j->img_comp[m].raw_coeff = NULL; + } + j->restart_interval = 0; + if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0; + m = stbi__get_marker(j); + while (!stbi__EOI(m)) { + if (stbi__SOS(m)) { + if (!stbi__process_scan_header(j)) return 0; + if (!stbi__parse_entropy_coded_data(j)) return 0; + if (j->marker == STBI__MARKER_none ) { + j->marker = stbi__skip_jpeg_junk_at_end(j); + // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0 + } + m = stbi__get_marker(j); + if (STBI__RESTART(m)) + m = stbi__get_marker(j); + } else if (stbi__DNL(m)) { + int Ld = stbi__get16be(j->s); + stbi__uint32 NL = stbi__get16be(j->s); + if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG"); + if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG"); + m = stbi__get_marker(j); + } else { + if (!stbi__process_marker(j, m)) return 1; + m = stbi__get_marker(j); + } + } + if (j->progressive) + stbi__jpeg_finish(j); + return 1; +} + +// static jfif-centered resampling (across block boundaries) + +typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1, + int w, int hs); + +#define stbi__div4(x) ((stbi_uc) ((x) >> 2)) + +static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs) +{ + STBI_NOTUSED(out); + STBI_NOTUSED(in_far); + STBI_NOTUSED(w); + STBI_NOTUSED(hs); + return in_near; +} + +static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs) +{ + // need to generate two samples vertically for every one in input + int i; + STBI_NOTUSED(hs); + for (i=0; i < w; ++i) + out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2); + return out; +} + +static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs) +{ + // need to generate two samples horizontally for every one in input + int i; + stbi_uc *input = in_near; + + if (w == 1) { + // if only one sample, can't do any interpolation + out[0] = out[1] = input[0]; + return out; + } + + out[0] = input[0]; + out[1] = stbi__div4(input[0]*3 + input[1] + 2); + for (i=1; i < w-1; ++i) { + int n = 3*input[i]+2; + out[i*2+0] = stbi__div4(n+input[i-1]); + out[i*2+1] = stbi__div4(n+input[i+1]); + } + out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2); + out[i*2+1] = input[w-1]; + + STBI_NOTUSED(in_far); + STBI_NOTUSED(hs); + + return out; +} + +#define stbi__div16(x) ((stbi_uc) ((x) >> 4)) + +static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs) +{ + // need to generate 2x2 samples for every one in input + int i,t0,t1; + if (w == 1) { + out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2); + return out; + } + + t1 = 3*in_near[0] + in_far[0]; + out[0] = stbi__div4(t1+2); + for (i=1; i < w; ++i) { + t0 = t1; + t1 = 3*in_near[i]+in_far[i]; + out[i*2-1] = stbi__div16(3*t0 + t1 + 8); + out[i*2 ] = stbi__div16(3*t1 + t0 + 8); + } + out[w*2-1] = stbi__div4(t1+2); + + STBI_NOTUSED(hs); + + return out; +} + +#if defined(STBI_SSE2) || defined(STBI_NEON) +static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs) +{ + // need to generate 2x2 samples for every one in input + int i=0,t0,t1; + + if (w == 1) { + out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 
2); + return out; + } + + t1 = 3*in_near[0] + in_far[0]; + // process groups of 8 pixels for as long as we can. + // note we can't handle the last pixel in a row in this loop + // because we need to handle the filter boundary conditions. + for (; i < ((w-1) & ~7); i += 8) { +#if defined(STBI_SSE2) + // load and perform the vertical filtering pass + // this uses 3*x + y = 4*x + (y - x) + __m128i zero = _mm_setzero_si128(); + __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i)); + __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i)); + __m128i farw = _mm_unpacklo_epi8(farb, zero); + __m128i nearw = _mm_unpacklo_epi8(nearb, zero); + __m128i diff = _mm_sub_epi16(farw, nearw); + __m128i nears = _mm_slli_epi16(nearw, 2); + __m128i curr = _mm_add_epi16(nears, diff); // current row + + // horizontal filter works the same based on shifted vers of current + // row. "prev" is current row shifted right by 1 pixel; we need to + // insert the previous pixel value (from t1). + // "next" is current row shifted left by 1 pixel, with first pixel + // of next block of 8 pixels added in. + __m128i prv0 = _mm_slli_si128(curr, 2); + __m128i nxt0 = _mm_srli_si128(curr, 2); + __m128i prev = _mm_insert_epi16(prv0, t1, 0); + __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7); + + // horizontal filter, polyphase implementation since it's convenient: + // even pixels = 3*cur + prev = cur*4 + (prev - cur) + // odd pixels = 3*cur + next = cur*4 + (next - cur) + // note the shared term. + __m128i bias = _mm_set1_epi16(8); + __m128i curs = _mm_slli_epi16(curr, 2); + __m128i prvd = _mm_sub_epi16(prev, curr); + __m128i nxtd = _mm_sub_epi16(next, curr); + __m128i curb = _mm_add_epi16(curs, bias); + __m128i even = _mm_add_epi16(prvd, curb); + __m128i odd = _mm_add_epi16(nxtd, curb); + + // interleave even and odd pixels, then undo scaling. + __m128i int0 = _mm_unpacklo_epi16(even, odd); + __m128i int1 = _mm_unpackhi_epi16(even, odd); + __m128i de0 = _mm_srli_epi16(int0, 4); + __m128i de1 = _mm_srli_epi16(int1, 4); + + // pack and write output + __m128i outv = _mm_packus_epi16(de0, de1); + _mm_storeu_si128((__m128i *) (out + i*2), outv); +#elif defined(STBI_NEON) + // load and perform the vertical filtering pass + // this uses 3*x + y = 4*x + (y - x) + uint8x8_t farb = vld1_u8(in_far + i); + uint8x8_t nearb = vld1_u8(in_near + i); + int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb)); + int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2)); + int16x8_t curr = vaddq_s16(nears, diff); // current row + + // horizontal filter works the same based on shifted vers of current + // row. "prev" is current row shifted right by 1 pixel; we need to + // insert the previous pixel value (from t1). + // "next" is current row shifted left by 1 pixel, with first pixel + // of next block of 8 pixels added in. + int16x8_t prv0 = vextq_s16(curr, curr, 7); + int16x8_t nxt0 = vextq_s16(curr, curr, 1); + int16x8_t prev = vsetq_lane_s16(t1, prv0, 0); + int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7); + + // horizontal filter, polyphase implementation since it's convenient: + // even pixels = 3*cur + prev = cur*4 + (prev - cur) + // odd pixels = 3*cur + next = cur*4 + (next - cur) + // note the shared term. 
+ int16x8_t curs = vshlq_n_s16(curr, 2); + int16x8_t prvd = vsubq_s16(prev, curr); + int16x8_t nxtd = vsubq_s16(next, curr); + int16x8_t even = vaddq_s16(curs, prvd); + int16x8_t odd = vaddq_s16(curs, nxtd); + + // undo scaling and round, then store with even/odd phases interleaved + uint8x8x2_t o; + o.val[0] = vqrshrun_n_s16(even, 4); + o.val[1] = vqrshrun_n_s16(odd, 4); + vst2_u8(out + i*2, o); +#endif + + // "previous" value for next iter + t1 = 3*in_near[i+7] + in_far[i+7]; + } + + t0 = t1; + t1 = 3*in_near[i] + in_far[i]; + out[i*2] = stbi__div16(3*t1 + t0 + 8); + + for (++i; i < w; ++i) { + t0 = t1; + t1 = 3*in_near[i]+in_far[i]; + out[i*2-1] = stbi__div16(3*t0 + t1 + 8); + out[i*2 ] = stbi__div16(3*t1 + t0 + 8); + } + out[w*2-1] = stbi__div4(t1+2); + + STBI_NOTUSED(hs); + + return out; +} +#endif + +static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs) +{ + // resample with nearest-neighbor + int i,j; + STBI_NOTUSED(in_far); + for (i=0; i < w; ++i) + for (j=0; j < hs; ++j) + out[i*hs+j] = in_near[i]; + return out; +} + +// this is a reduced-precision calculation of YCbCr-to-RGB introduced +// to make sure the code produces the same results in both SIMD and scalar +#define stbi__float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8) +static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step) +{ + int i; + for (i=0; i < count; ++i) { + int y_fixed = (y[i] << 20) + (1<<19); // rounding + int r,g,b; + int cr = pcr[i] - 128; + int cb = pcb[i] - 128; + r = y_fixed + cr* stbi__float2fixed(1.40200f); + g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000); + b = y_fixed + cb* stbi__float2fixed(1.77200f); + r >>= 20; + g >>= 20; + b >>= 20; + if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; } + if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; } + if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; } + out[0] = (stbi_uc)r; + out[1] = (stbi_uc)g; + out[2] = (stbi_uc)b; + out[3] = 255; + out += step; + } +} + +#if defined(STBI_SSE2) || defined(STBI_NEON) +static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step) +{ + int i = 0; + +#ifdef STBI_SSE2 + // step == 3 is pretty ugly on the final interleave, and i'm not convinced + // it's useful in practice (you wouldn't use it for textures, for example). + // so just accelerate step == 4 case. + if (step == 4) { + // this is a fairly straightforward implementation and not super-optimized. 
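+ // the constants below are the standard JPEG YCbCr->RGB coefficients in 4096-scaled fixed point; the two green-channel terms are negated because they are subtracted in the reference formula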
+ __m128i signflip = _mm_set1_epi8(-0x80); + __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f)); + __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f)); + __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f)); + __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f)); + __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128); + __m128i xw = _mm_set1_epi16(255); // alpha channel + + for (; i+7 < count; i += 8) { + // load + __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i)); + __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i)); + __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i)); + __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128 + __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128 + + // unpack to short (and left-shift cr, cb by 8) + __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes); + __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased); + __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased); + + // color transform + __m128i yws = _mm_srli_epi16(yw, 4); + __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw); + __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw); + __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1); + __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1); + __m128i rws = _mm_add_epi16(cr0, yws); + __m128i gwt = _mm_add_epi16(cb0, yws); + __m128i bws = _mm_add_epi16(yws, cb1); + __m128i gws = _mm_add_epi16(gwt, cr1); + + // descale + __m128i rw = _mm_srai_epi16(rws, 4); + __m128i bw = _mm_srai_epi16(bws, 4); + __m128i gw = _mm_srai_epi16(gws, 4); + + // back to byte, set up for transpose + __m128i brb = _mm_packus_epi16(rw, bw); + __m128i gxb = _mm_packus_epi16(gw, xw); + + // transpose to interleave channels + __m128i t0 = _mm_unpacklo_epi8(brb, gxb); + __m128i t1 = _mm_unpackhi_epi8(brb, gxb); + __m128i o0 = _mm_unpacklo_epi16(t0, t1); + __m128i o1 = _mm_unpackhi_epi16(t0, t1); + + // store + _mm_storeu_si128((__m128i *) (out + 0), o0); + _mm_storeu_si128((__m128i *) (out + 16), o1); + out += 32; + } + } +#endif + +#ifdef STBI_NEON + // in this version, step=3 support would be easy to add. but is there demand? + if (step == 4) { + // this is a fairly straightforward implementation and not super-optimized. 
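+ // cr/cb are widened with a left shift of 7 so that vqdmulhq_s16, which computes (2*a*b)>>16, yields coefficient*value scaled by 16, matching the y<<4 term before the final rounding shift by 4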
+ uint8x8_t signflip = vdup_n_u8(0x80); + int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f)); + int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f)); + int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f)); + int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f)); + + for (; i+7 < count; i += 8) { + // load + uint8x8_t y_bytes = vld1_u8(y + i); + uint8x8_t cr_bytes = vld1_u8(pcr + i); + uint8x8_t cb_bytes = vld1_u8(pcb + i); + int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip)); + int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip)); + + // expand to s16 + int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4)); + int16x8_t crw = vshll_n_s8(cr_biased, 7); + int16x8_t cbw = vshll_n_s8(cb_biased, 7); + + // color transform + int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0); + int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0); + int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1); + int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1); + int16x8_t rws = vaddq_s16(yws, cr0); + int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1); + int16x8_t bws = vaddq_s16(yws, cb1); + + // undo scaling, round, convert to byte + uint8x8x4_t o; + o.val[0] = vqrshrun_n_s16(rws, 4); + o.val[1] = vqrshrun_n_s16(gws, 4); + o.val[2] = vqrshrun_n_s16(bws, 4); + o.val[3] = vdup_n_u8(255); + + // store, interleaving r/g/b/a + vst4_u8(out, o); + out += 8*4; + } + } +#endif + + for (; i < count; ++i) { + int y_fixed = (y[i] << 20) + (1<<19); // rounding + int r,g,b; + int cr = pcr[i] - 128; + int cb = pcb[i] - 128; + r = y_fixed + cr* stbi__float2fixed(1.40200f); + g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000); + b = y_fixed + cb* stbi__float2fixed(1.77200f); + r >>= 20; + g >>= 20; + b >>= 20; + if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; } + if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; } + if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; } + out[0] = (stbi_uc)r; + out[1] = (stbi_uc)g; + out[2] = (stbi_uc)b; + out[3] = 255; + out += step; + } +} +#endif + +// set up the kernels +static void stbi__setup_jpeg(stbi__jpeg *j) +{ + j->idct_block_kernel = stbi__idct_block; + j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row; + j->resample_row_hv_2_kernel = stbi__resample_row_hv_2; + +#ifdef STBI_SSE2 + if (stbi__sse2_available()) { + j->idct_block_kernel = stbi__idct_simd; + j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd; + j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd; + } +#endif + +#ifdef STBI_NEON + j->idct_block_kernel = stbi__idct_simd; + j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd; + j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd; +#endif +} + +// clean up the temporary component buffers +static void stbi__cleanup_jpeg(stbi__jpeg *j) +{ + stbi__free_jpeg_components(j, j->s->img_n, 0); +} + +typedef struct +{ + resample_row_func resample; + stbi_uc *line0,*line1; + int hs,vs; // expansion factor in each axis + int w_lores; // horizontal pixels pre-expansion + int ystep; // how far through vertical expansion we are + int ypos; // which pre-expansion row we're on +} stbi__resample; + +// fast 0..255 * 0..255 => 0..255 rounded multiplication +static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y) +{ + unsigned int t = x*y + 128; + return (stbi_uc) ((t + (t >>8)) >> 8); +} + +static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp) +{ + int n, decode_n, is_rgb; + z->s->img_n = 0; // make 
stbi__cleanup_jpeg safe + + // validate req_comp + if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error"); + + // load a jpeg image from whichever source, but leave in YCbCr format + if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; } + + // determine actual number of components to generate + n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1; + + is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif)); + + if (z->s->img_n == 3 && n < 3 && !is_rgb) + decode_n = 1; + else + decode_n = z->s->img_n; + + // nothing to do if no components requested; check this now to avoid + // accessing uninitialized coutput[0] later + if (decode_n <= 0) { stbi__cleanup_jpeg(z); return NULL; } + + // resample and color-convert + { + int k; + unsigned int i,j; + stbi_uc *output; + stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL }; + + stbi__resample res_comp[4]; + + for (k=0; k < decode_n; ++k) { + stbi__resample *r = &res_comp[k]; + + // allocate line buffer big enough for upsampling off the edges + // with upsample factor of 4 + z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3); + if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); } + + r->hs = z->img_h_max / z->img_comp[k].h; + r->vs = z->img_v_max / z->img_comp[k].v; + r->ystep = r->vs >> 1; + r->w_lores = (z->s->img_x + r->hs-1) / r->hs; + r->ypos = 0; + r->line0 = r->line1 = z->img_comp[k].data; + + if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1; + else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2; + else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2; + else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel; + else r->resample = stbi__resample_row_generic; + } + + // can't error after this so, this is safe + output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1); + if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); } + + // now go ahead and resample + for (j=0; j < z->s->img_y; ++j) { + stbi_uc *out = output + n * z->s->img_x * j; + for (k=0; k < decode_n; ++k) { + stbi__resample *r = &res_comp[k]; + int y_bot = r->ystep >= (r->vs >> 1); + coutput[k] = r->resample(z->img_comp[k].linebuf, + y_bot ? r->line1 : r->line0, + y_bot ? 
r->line0 : r->line1, + r->w_lores, r->hs); + if (++r->ystep >= r->vs) { + r->ystep = 0; + r->line0 = r->line1; + if (++r->ypos < z->img_comp[k].y) + r->line1 += z->img_comp[k].w2; + } + } + if (n >= 3) { + stbi_uc *y = coutput[0]; + if (z->s->img_n == 3) { + if (is_rgb) { + for (i=0; i < z->s->img_x; ++i) { + out[0] = y[i]; + out[1] = coutput[1][i]; + out[2] = coutput[2][i]; + out[3] = 255; + out += n; + } + } else { + z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n); + } + } else if (z->s->img_n == 4) { + if (z->app14_color_transform == 0) { // CMYK + for (i=0; i < z->s->img_x; ++i) { + stbi_uc m = coutput[3][i]; + out[0] = stbi__blinn_8x8(coutput[0][i], m); + out[1] = stbi__blinn_8x8(coutput[1][i], m); + out[2] = stbi__blinn_8x8(coutput[2][i], m); + out[3] = 255; + out += n; + } + } else if (z->app14_color_transform == 2) { // YCCK + z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n); + for (i=0; i < z->s->img_x; ++i) { + stbi_uc m = coutput[3][i]; + out[0] = stbi__blinn_8x8(255 - out[0], m); + out[1] = stbi__blinn_8x8(255 - out[1], m); + out[2] = stbi__blinn_8x8(255 - out[2], m); + out += n; + } + } else { // YCbCr + alpha? Ignore the fourth channel for now + z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n); + } + } else + for (i=0; i < z->s->img_x; ++i) { + out[0] = out[1] = out[2] = y[i]; + out[3] = 255; // not used if n==3 + out += n; + } + } else { + if (is_rgb) { + if (n == 1) + for (i=0; i < z->s->img_x; ++i) + *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]); + else { + for (i=0; i < z->s->img_x; ++i, out += 2) { + out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]); + out[1] = 255; + } + } + } else if (z->s->img_n == 4 && z->app14_color_transform == 0) { + for (i=0; i < z->s->img_x; ++i) { + stbi_uc m = coutput[3][i]; + stbi_uc r = stbi__blinn_8x8(coutput[0][i], m); + stbi_uc g = stbi__blinn_8x8(coutput[1][i], m); + stbi_uc b = stbi__blinn_8x8(coutput[2][i], m); + out[0] = stbi__compute_y(r, g, b); + out[1] = 255; + out += n; + } + } else if (z->s->img_n == 4 && z->app14_color_transform == 2) { + for (i=0; i < z->s->img_x; ++i) { + out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]); + out[1] = 255; + out += n; + } + } else { + stbi_uc *y = coutput[0]; + if (n == 1) + for (i=0; i < z->s->img_x; ++i) out[i] = y[i]; + else + for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; } + } + } + } + stbi__cleanup_jpeg(z); + *out_x = z->s->img_x; + *out_y = z->s->img_y; + if (comp) *comp = z->s->img_n >= 3 ? 
3 : 1; // report original components, not output + return output; + } +} + +static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + unsigned char* result; + stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg)); + if (!j) return stbi__errpuc("outofmem", "Out of memory"); + memset(j, 0, sizeof(stbi__jpeg)); + STBI_NOTUSED(ri); + j->s = s; + stbi__setup_jpeg(j); + result = load_jpeg_image(j, x,y,comp,req_comp); + STBI_FREE(j); + return result; +} + +static int stbi__jpeg_test(stbi__context *s) +{ + int r; + stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg)); + if (!j) return stbi__err("outofmem", "Out of memory"); + memset(j, 0, sizeof(stbi__jpeg)); + j->s = s; + stbi__setup_jpeg(j); + r = stbi__decode_jpeg_header(j, STBI__SCAN_type); + stbi__rewind(s); + STBI_FREE(j); + return r; +} + +static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp) +{ + if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) { + stbi__rewind( j->s ); + return 0; + } + if (x) *x = j->s->img_x; + if (y) *y = j->s->img_y; + if (comp) *comp = j->s->img_n >= 3 ? 3 : 1; + return 1; +} + +static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp) +{ + int result; + stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg))); + if (!j) return stbi__err("outofmem", "Out of memory"); + memset(j, 0, sizeof(stbi__jpeg)); + j->s = s; + result = stbi__jpeg_info_raw(j, x, y, comp); + STBI_FREE(j); + return result; +} +#endif + +// public domain zlib decode v0.2 Sean Barrett 2006-11-18 +// simple implementation +// - all input must be provided in an upfront buffer +// - all output is written to a single output buffer (can malloc/realloc) +// performance +// - fast huffman + +#ifndef STBI_NO_ZLIB + +// fast-way is faster to check than jpeg huffman, but slow way is slower +#define STBI__ZFAST_BITS 9 // accelerate all cases in default tables +#define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1) +#define STBI__ZNSYMS 288 // number of symbols in literal/length alphabet + +// zlib-style huffman encoding +// (jpegs packs from left, zlib from right, so can't share code) +typedef struct +{ + stbi__uint16 fast[1 << STBI__ZFAST_BITS]; + stbi__uint16 firstcode[16]; + int maxcode[17]; + stbi__uint16 firstsymbol[16]; + stbi_uc size[STBI__ZNSYMS]; + stbi__uint16 value[STBI__ZNSYMS]; +} stbi__zhuffman; + +stbi_inline static int stbi__bitreverse16(int n) +{ + n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1); + n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2); + n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4); + n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8); + return n; +} + +stbi_inline static int stbi__bit_reverse(int v, int bits) +{ + STBI_ASSERT(bits <= 16); + // to bit reverse n bits, reverse 16 and shift + // e.g. 
11 bits, bit reverse and shift away 5 + return stbi__bitreverse16(v) >> (16-bits); +} + +static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num) +{ + int i,k=0; + int code, next_code[16], sizes[17]; + + // DEFLATE spec for generating codes + memset(sizes, 0, sizeof(sizes)); + memset(z->fast, 0, sizeof(z->fast)); + for (i=0; i < num; ++i) + ++sizes[sizelist[i]]; + sizes[0] = 0; + for (i=1; i < 16; ++i) + if (sizes[i] > (1 << i)) + return stbi__err("bad sizes", "Corrupt PNG"); + code = 0; + for (i=1; i < 16; ++i) { + next_code[i] = code; + z->firstcode[i] = (stbi__uint16) code; + z->firstsymbol[i] = (stbi__uint16) k; + code = (code + sizes[i]); + if (sizes[i]) + if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG"); + z->maxcode[i] = code << (16-i); // preshift for inner loop + code <<= 1; + k += sizes[i]; + } + z->maxcode[16] = 0x10000; // sentinel + for (i=0; i < num; ++i) { + int s = sizelist[i]; + if (s) { + int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s]; + stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i); + z->size [c] = (stbi_uc ) s; + z->value[c] = (stbi__uint16) i; + if (s <= STBI__ZFAST_BITS) { + int j = stbi__bit_reverse(next_code[s],s); + while (j < (1 << STBI__ZFAST_BITS)) { + z->fast[j] = fastv; + j += (1 << s); + } + } + ++next_code[s]; + } + } + return 1; +} + +// zlib-from-memory implementation for PNG reading +// because PNG allows splitting the zlib stream arbitrarily, +// and it's annoying structurally to have PNG call ZLIB call PNG, +// we require PNG read all the IDATs and combine them into a single +// memory buffer + +typedef struct +{ + stbi_uc *zbuffer, *zbuffer_end; + int num_bits; + stbi__uint32 code_buffer; + + char *zout; + char *zout_start; + char *zout_end; + int z_expandable; + + stbi__zhuffman z_length, z_distance; +} stbi__zbuf; + +stbi_inline static int stbi__zeof(stbi__zbuf *z) +{ + return (z->zbuffer >= z->zbuffer_end); +} + +stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z) +{ + return stbi__zeof(z) ? 0 : *z->zbuffer++; +} + +static void stbi__fill_bits(stbi__zbuf *z) +{ + do { + if (z->code_buffer >= (1U << z->num_bits)) { + z->zbuffer = z->zbuffer_end; /* treat this as EOF so we fail. */ + return; + } + z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits; + z->num_bits += 8; + } while (z->num_bits <= 24); +} + +stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n) +{ + unsigned int k; + if (z->num_bits < n) stbi__fill_bits(z); + k = z->code_buffer & ((1 << n) - 1); + z->code_buffer >>= n; + z->num_bits -= n; + return k; +} + +static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z) +{ + int b,s,k; + // not resolved by fast table, so compute it the slow way + // use jpeg approach, which requires MSbits at top + k = stbi__bit_reverse(a->code_buffer, 16); + for (s=STBI__ZFAST_BITS+1; ; ++s) + if (k < z->maxcode[s]) + break; + if (s >= 16) return -1; // invalid code! + // code size is s, so: + b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s]; + if (b >= STBI__ZNSYMS) return -1; // some data was corrupt somewhere! + if (z->size[b] != s) return -1; // was originally an assert, but report failure instead. + a->code_buffer >>= s; + a->num_bits -= s; + return z->value[b]; +} + +stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z) +{ + int b,s; + if (a->num_bits < 16) { + if (stbi__zeof(a)) { + return -1; /* report error for unexpected end of data. 
*/ + } + stbi__fill_bits(a); + } + b = z->fast[a->code_buffer & STBI__ZFAST_MASK]; + if (b) { + s = b >> 9; + a->code_buffer >>= s; + a->num_bits -= s; + return b & 511; + } + return stbi__zhuffman_decode_slowpath(a, z); +} + +static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes +{ + char *q; + unsigned int cur, limit, old_limit; + z->zout = zout; + if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG"); + cur = (unsigned int) (z->zout - z->zout_start); + limit = old_limit = (unsigned) (z->zout_end - z->zout_start); + if (UINT_MAX - cur < (unsigned) n) return stbi__err("outofmem", "Out of memory"); + while (cur + n > limit) { + if(limit > UINT_MAX / 2) return stbi__err("outofmem", "Out of memory"); + limit *= 2; + } + q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit); + STBI_NOTUSED(old_limit); + if (q == NULL) return stbi__err("outofmem", "Out of memory"); + z->zout_start = q; + z->zout = q + cur; + z->zout_end = q + limit; + return 1; +} + +static const int stbi__zlength_base[31] = { + 3,4,5,6,7,8,9,10,11,13, + 15,17,19,23,27,31,35,43,51,59, + 67,83,99,115,131,163,195,227,258,0,0 }; + +static const int stbi__zlength_extra[31]= +{ 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 }; + +static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193, +257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0}; + +static const int stbi__zdist_extra[32] = +{ 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13}; + +static int stbi__parse_huffman_block(stbi__zbuf *a) +{ + char *zout = a->zout; + for(;;) { + int z = stbi__zhuffman_decode(a, &a->z_length); + if (z < 256) { + if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes + if (zout >= a->zout_end) { + if (!stbi__zexpand(a, zout, 1)) return 0; + zout = a->zout; + } + *zout++ = (char) z; + } else { + stbi_uc *p; + int len,dist; + if (z == 256) { + a->zout = zout; + return 1; + } + if (z >= 286) return stbi__err("bad huffman code","Corrupt PNG"); // per DEFLATE, length codes 286 and 287 must not appear in compressed data + z -= 257; + len = stbi__zlength_base[z]; + if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]); + z = stbi__zhuffman_decode(a, &a->z_distance); + if (z < 0 || z >= 30) return stbi__err("bad huffman code","Corrupt PNG"); // per DEFLATE, distance codes 30 and 31 must not appear in compressed data + dist = stbi__zdist_base[z]; + if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]); + if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG"); + if (zout + len > a->zout_end) { + if (!stbi__zexpand(a, zout, len)) return 0; + zout = a->zout; + } + p = (stbi_uc *) (zout - dist); + if (dist == 1) { // run of one byte; common in images. 
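+ // with dist == 1 every output byte repeats the same source byte, so read it once and splat it rather than re-reading *p on every iteration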
+ stbi_uc v = *p; + if (len) { do *zout++ = v; while (--len); } + } else { + if (len) { do *zout++ = *p++; while (--len); } + } + } + } +} + +static int stbi__compute_huffman_codes(stbi__zbuf *a) +{ + static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 }; + stbi__zhuffman z_codelength; + stbi_uc lencodes[286+32+137];//padding for maximum single op + stbi_uc codelength_sizes[19]; + int i,n; + + int hlit = stbi__zreceive(a,5) + 257; + int hdist = stbi__zreceive(a,5) + 1; + int hclen = stbi__zreceive(a,4) + 4; + int ntot = hlit + hdist; + + memset(codelength_sizes, 0, sizeof(codelength_sizes)); + for (i=0; i < hclen; ++i) { + int s = stbi__zreceive(a,3); + codelength_sizes[length_dezigzag[i]] = (stbi_uc) s; + } + if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0; + + n = 0; + while (n < ntot) { + int c = stbi__zhuffman_decode(a, &z_codelength); + if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG"); + if (c < 16) + lencodes[n++] = (stbi_uc) c; + else { + stbi_uc fill = 0; + if (c == 16) { + c = stbi__zreceive(a,2)+3; + if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG"); + fill = lencodes[n-1]; + } else if (c == 17) { + c = stbi__zreceive(a,3)+3; + } else if (c == 18) { + c = stbi__zreceive(a,7)+11; + } else { + return stbi__err("bad codelengths", "Corrupt PNG"); + } + if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG"); + memset(lencodes+n, fill, c); + n += c; + } + } + if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG"); + if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0; + if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0; + return 1; +} + +static int stbi__parse_uncompressed_block(stbi__zbuf *a) +{ + stbi_uc header[4]; + int len,nlen,k; + if (a->num_bits & 7) + stbi__zreceive(a, a->num_bits & 7); // discard + // drain the bit-packed data into header + k = 0; + while (a->num_bits > 0) { + header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check + a->code_buffer >>= 8; + a->num_bits -= 8; + } + if (a->num_bits < 0) return stbi__err("zlib corrupt","Corrupt PNG"); + // now fill header the normal way + while (k < 4) + header[k++] = stbi__zget8(a); + len = header[1] * 256 + header[0]; + nlen = header[3] * 256 + header[2]; + if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG"); + if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG"); + if (a->zout + len > a->zout_end) + if (!stbi__zexpand(a, a->zout, len)) return 0; + memcpy(a->zout, a->zbuffer, len); + a->zbuffer += len; + a->zout += len; + return 1; +} + +static int stbi__parse_zlib_header(stbi__zbuf *a) +{ + int cmf = stbi__zget8(a); + int cm = cmf & 15; + /* int cinfo = cmf >> 4; */ + int flg = stbi__zget8(a); + if (stbi__zeof(a)) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec + if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec + if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png + if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png + // window = 1 << (8 + cinfo)... 
but who cares, we fully buffer output + return 1; +} + +static const stbi_uc stbi__zdefault_length[STBI__ZNSYMS] = +{ + 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, + 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, + 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, + 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, + 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, + 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, + 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, + 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, + 7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8 +}; +static const stbi_uc stbi__zdefault_distance[32] = +{ + 5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5 +}; +/* +Init algorithm: +{ + int i; // use <= to match clearly with spec + for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8; + for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9; + for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7; + for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8; + + for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5; +} +*/ + +static int stbi__parse_zlib(stbi__zbuf *a, int parse_header) +{ + int final, type; + if (parse_header) + if (!stbi__parse_zlib_header(a)) return 0; + a->num_bits = 0; + a->code_buffer = 0; + do { + final = stbi__zreceive(a,1); + type = stbi__zreceive(a,2); + if (type == 0) { + if (!stbi__parse_uncompressed_block(a)) return 0; + } else if (type == 3) { + return 0; + } else { + if (type == 1) { + // use fixed code lengths + if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , STBI__ZNSYMS)) return 0; + if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0; + } else { + if (!stbi__compute_huffman_codes(a)) return 0; + } + if (!stbi__parse_huffman_block(a)) return 0; + } + } while (!final); + return 1; +} + +static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header) +{ + a->zout_start = obuf; + a->zout = obuf; + a->zout_end = obuf + olen; + a->z_expandable = exp; + + return stbi__parse_zlib(a, parse_header); +} + +STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen) +{ + stbi__zbuf a; + char *p = (char *) stbi__malloc(initial_size); + if (p == NULL) return NULL; + a.zbuffer = (stbi_uc *) buffer; + a.zbuffer_end = (stbi_uc *) buffer + len; + if (stbi__do_zlib(&a, p, initial_size, 1, 1)) { + if (outlen) *outlen = (int) (a.zout - a.zout_start); + return a.zout_start; + } else { + STBI_FREE(a.zout_start); + return NULL; + } +} + +STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen) +{ + return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen); +} + +STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header) +{ + stbi__zbuf a; + char *p = (char *) stbi__malloc(initial_size); + if (p == NULL) return NULL; + a.zbuffer = (stbi_uc *) buffer; + a.zbuffer_end = (stbi_uc *) buffer + len; + if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) { + if (outlen) *outlen = (int) (a.zout - a.zout_start); + return a.zout_start; + } else { + STBI_FREE(a.zout_start); + return NULL; + } +} + +STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen) +{ + stbi__zbuf a; + a.zbuffer = (stbi_uc *) ibuffer; + a.zbuffer_end = (stbi_uc *) ibuffer + 
ilen; + if (stbi__do_zlib(&a, obuffer, olen, 0, 1)) + return (int) (a.zout - a.zout_start); + else + return -1; +} + +STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen) +{ + stbi__zbuf a; + char *p = (char *) stbi__malloc(16384); + if (p == NULL) return NULL; + a.zbuffer = (stbi_uc *) buffer; + a.zbuffer_end = (stbi_uc *) buffer+len; + if (stbi__do_zlib(&a, p, 16384, 1, 0)) { + if (outlen) *outlen = (int) (a.zout - a.zout_start); + return a.zout_start; + } else { + STBI_FREE(a.zout_start); + return NULL; + } +} + +STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen) +{ + stbi__zbuf a; + a.zbuffer = (stbi_uc *) ibuffer; + a.zbuffer_end = (stbi_uc *) ibuffer + ilen; + if (stbi__do_zlib(&a, obuffer, olen, 0, 0)) + return (int) (a.zout - a.zout_start); + else + return -1; +} +#endif + +// public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18 +// simple implementation +// - only 8-bit samples +// - no CRC checking +// - allocates lots of intermediate memory +// - avoids problem of streaming data between subsystems +// - avoids explicit window management +// performance +// - uses stb_zlib, a PD zlib implementation with fast huffman decoding + +#ifndef STBI_NO_PNG +typedef struct +{ + stbi__uint32 length; + stbi__uint32 type; +} stbi__pngchunk; + +static stbi__pngchunk stbi__get_chunk_header(stbi__context *s) +{ + stbi__pngchunk c; + c.length = stbi__get32be(s); + c.type = stbi__get32be(s); + return c; +} + +static int stbi__check_png_header(stbi__context *s) +{ + static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 }; + int i; + for (i=0; i < 8; ++i) + if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG"); + return 1; +} + +typedef struct +{ + stbi__context *s; + stbi_uc *idata, *expanded, *out; + int depth; +} stbi__png; + + +enum { + STBI__F_none=0, + STBI__F_sub=1, + STBI__F_up=2, + STBI__F_avg=3, + STBI__F_paeth=4, + // synthetic filters used for first scanline to avoid needing a dummy row of 0s + STBI__F_avg_first, + STBI__F_paeth_first +}; + +static stbi_uc first_row_filter[5] = +{ + STBI__F_none, + STBI__F_sub, + STBI__F_none, + STBI__F_avg_first, + STBI__F_paeth_first +}; + +static int stbi__paeth(int a, int b, int c) +{ + int p = a + b - c; + int pa = abs(p-a); + int pb = abs(p-b); + int pc = abs(p-c); + if (pa <= pb && pa <= pc) return a; + if (pb <= pc) return b; + return c; +} + +static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 }; + +// create the png data from post-deflated data +static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color) +{ + int bytes = (depth == 16? 
2 : 1); + stbi__context *s = a->s; + stbi__uint32 i,j,stride = x*out_n*bytes; + stbi__uint32 img_len, img_width_bytes; + int k; + int img_n = s->img_n; // copy it into a local for later + + int output_bytes = out_n*bytes; + int filter_bytes = img_n*bytes; + int width = x; + + STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1); + a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into + if (!a->out) return stbi__err("outofmem", "Out of memory"); + + if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG"); + img_width_bytes = (((img_n * x * depth) + 7) >> 3); + img_len = (img_width_bytes + 1) * y; + + // we used to check for exact match between raw_len and img_len on non-interlaced PNGs, + // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros), + // so just check for raw_len < img_len always. + if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG"); + + for (j=0; j < y; ++j) { + stbi_uc *cur = a->out + stride*j; + stbi_uc *prior; + int filter = *raw++; + + if (filter > 4) + return stbi__err("invalid filter","Corrupt PNG"); + + if (depth < 8) { + if (img_width_bytes > x) return stbi__err("invalid width","Corrupt PNG"); + cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place + filter_bytes = 1; + width = img_width_bytes; + } + prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above + + // if first row, use special filter that doesn't sample previous row + if (j == 0) filter = first_row_filter[filter]; + + // handle first byte explicitly + for (k=0; k < filter_bytes; ++k) { + switch (filter) { + case STBI__F_none : cur[k] = raw[k]; break; + case STBI__F_sub : cur[k] = raw[k]; break; + case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break; + case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break; + case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break; + case STBI__F_avg_first : cur[k] = raw[k]; break; + case STBI__F_paeth_first: cur[k] = raw[k]; break; + } + } + + if (depth == 8) { + if (img_n != out_n) + cur[img_n] = 255; // first pixel + raw += img_n; + cur += out_n; + prior += out_n; + } else if (depth == 16) { + if (img_n != out_n) { + cur[filter_bytes] = 255; // first pixel top byte + cur[filter_bytes+1] = 255; // first pixel bottom byte + } + raw += filter_bytes; + cur += output_bytes; + prior += output_bytes; + } else { + raw += 1; + cur += 1; + prior += 1; + } + + // this is a little gross, so that we don't switch per-pixel or per-component + if (depth < 8 || img_n == out_n) { + int nk = (width - 1)*filter_bytes; + #define STBI__CASE(f) \ + case f: \ + for (k=0; k < nk; ++k) + switch (filter) { + // "none" filter turns into a memcpy here; make that explicit. 
+ case STBI__F_none: memcpy(cur, raw, nk); break; + STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); } break; + STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break; + STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); } break; + STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); } break; + STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); } break; + STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); } break; + } + #undef STBI__CASE + raw += nk; + } else { + STBI_ASSERT(img_n+1 == out_n); + #define STBI__CASE(f) \ + case f: \ + for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \ + for (k=0; k < filter_bytes; ++k) + switch (filter) { + STBI__CASE(STBI__F_none) { cur[k] = raw[k]; } break; + STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); } break; + STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break; + STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); } break; + STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); } break; + STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); } break; + STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); } break; + } + #undef STBI__CASE + + // the loop above sets the high byte of the pixels' alpha, but for + // 16 bit png files we also need the low byte set. we'll do that here. + if (depth == 16) { + cur = a->out + stride*j; // start at the beginning of the row again + for (i=0; i < x; ++i,cur+=output_bytes) { + cur[filter_bytes+1] = 255; + } + } + } + } + + // we make a separate pass to expand bits to pixels; for performance, + // this could run two scanlines behind the above code, so it won't + // intefere with filtering but will still be in the cache. + if (depth < 8) { + for (j=0; j < y; ++j) { + stbi_uc *cur = a->out + stride*j; + stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes; + // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit + // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop + stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range + + // note that the final byte might overshoot and write more data than desired. + // we can allocate enough data that this never writes out of memory, but it + // could also overwrite the next scanline. can it overwrite non-empty data + // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel. 
+ // so we need to explicitly clamp the final ones + + if (depth == 4) { + for (k=x*img_n; k >= 2; k-=2, ++in) { + *cur++ = scale * ((*in >> 4) ); + *cur++ = scale * ((*in ) & 0x0f); + } + if (k > 0) *cur++ = scale * ((*in >> 4) ); + } else if (depth == 2) { + for (k=x*img_n; k >= 4; k-=4, ++in) { + *cur++ = scale * ((*in >> 6) ); + *cur++ = scale * ((*in >> 4) & 0x03); + *cur++ = scale * ((*in >> 2) & 0x03); + *cur++ = scale * ((*in ) & 0x03); + } + if (k > 0) *cur++ = scale * ((*in >> 6) ); + if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03); + if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03); + } else if (depth == 1) { + for (k=x*img_n; k >= 8; k-=8, ++in) { + *cur++ = scale * ((*in >> 7) ); + *cur++ = scale * ((*in >> 6) & 0x01); + *cur++ = scale * ((*in >> 5) & 0x01); + *cur++ = scale * ((*in >> 4) & 0x01); + *cur++ = scale * ((*in >> 3) & 0x01); + *cur++ = scale * ((*in >> 2) & 0x01); + *cur++ = scale * ((*in >> 1) & 0x01); + *cur++ = scale * ((*in ) & 0x01); + } + if (k > 0) *cur++ = scale * ((*in >> 7) ); + if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01); + if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01); + if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01); + if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01); + if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01); + if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01); + } + if (img_n != out_n) { + int q; + // insert alpha = 255 + cur = a->out + stride*j; + if (img_n == 1) { + for (q=x-1; q >= 0; --q) { + cur[q*2+1] = 255; + cur[q*2+0] = cur[q]; + } + } else { + STBI_ASSERT(img_n == 3); + for (q=x-1; q >= 0; --q) { + cur[q*4+3] = 255; + cur[q*4+2] = cur[q*3+2]; + cur[q*4+1] = cur[q*3+1]; + cur[q*4+0] = cur[q*3+0]; + } + } + } + } + } else if (depth == 16) { + // force the image data from big-endian to platform-native. + // this is done in a separate pass due to the decoding relying + // on the data being untouched, but could probably be done + // per-line during decode if care is taken. + stbi_uc *cur = a->out; + stbi__uint16 *cur16 = (stbi__uint16*)cur; + + for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) { + *cur16 = (cur[0] << 8) | cur[1]; + } + } + + return 1; +} + +static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced) +{ + int bytes = (depth == 16 ? 
2 : 1); + int out_bytes = out_n * bytes; + stbi_uc *final; + int p; + if (!interlaced) + return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color); + + // de-interlacing + final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0); + if (!final) return stbi__err("outofmem", "Out of memory"); + for (p=0; p < 7; ++p) { + int xorig[] = { 0,4,0,2,0,1,0 }; + int yorig[] = { 0,0,4,0,2,0,1 }; + int xspc[] = { 8,8,4,4,2,2,1 }; + int yspc[] = { 8,8,8,4,4,2,2 }; + int i,j,x,y; + // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1 + x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p]; + y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p]; + if (x && y) { + stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y; + if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) { + STBI_FREE(final); + return 0; + } + for (j=0; j < y; ++j) { + for (i=0; i < x; ++i) { + int out_y = j*yspc[p]+yorig[p]; + int out_x = i*xspc[p]+xorig[p]; + memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes, + a->out + (j*x+i)*out_bytes, out_bytes); + } + } + STBI_FREE(a->out); + image_data += img_len; + image_data_len -= img_len; + } + } + a->out = final; + + return 1; +} + +static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n) +{ + stbi__context *s = z->s; + stbi__uint32 i, pixel_count = s->img_x * s->img_y; + stbi_uc *p = z->out; + + // compute color-based transparency, assuming we've + // already got 255 as the alpha value in the output + STBI_ASSERT(out_n == 2 || out_n == 4); + + if (out_n == 2) { + for (i=0; i < pixel_count; ++i) { + p[1] = (p[0] == tc[0] ? 0 : 255); + p += 2; + } + } else { + for (i=0; i < pixel_count; ++i) { + if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2]) + p[3] = 0; + p += 4; + } + } + return 1; +} + +static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n) +{ + stbi__context *s = z->s; + stbi__uint32 i, pixel_count = s->img_x * s->img_y; + stbi__uint16 *p = (stbi__uint16*) z->out; + + // compute color-based transparency, assuming we've + // already got 65535 as the alpha value in the output + STBI_ASSERT(out_n == 2 || out_n == 4); + + if (out_n == 2) { + for (i = 0; i < pixel_count; ++i) { + p[1] = (p[0] == tc[0] ? 
0 : 65535); + p += 2; + } + } else { + for (i = 0; i < pixel_count; ++i) { + if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2]) + p[3] = 0; + p += 4; + } + } + return 1; +} + +static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n) +{ + stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y; + stbi_uc *p, *temp_out, *orig = a->out; + + p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0); + if (p == NULL) return stbi__err("outofmem", "Out of memory"); + + // between here and free(out) below, exitting would leak + temp_out = p; + + if (pal_img_n == 3) { + for (i=0; i < pixel_count; ++i) { + int n = orig[i]*4; + p[0] = palette[n ]; + p[1] = palette[n+1]; + p[2] = palette[n+2]; + p += 3; + } + } else { + for (i=0; i < pixel_count; ++i) { + int n = orig[i]*4; + p[0] = palette[n ]; + p[1] = palette[n+1]; + p[2] = palette[n+2]; + p[3] = palette[n+3]; + p += 4; + } + } + STBI_FREE(a->out); + a->out = temp_out; + + STBI_NOTUSED(len); + + return 1; +} + +static int stbi__unpremultiply_on_load_global = 0; +static int stbi__de_iphone_flag_global = 0; + +STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply) +{ + stbi__unpremultiply_on_load_global = flag_true_if_should_unpremultiply; +} + +STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert) +{ + stbi__de_iphone_flag_global = flag_true_if_should_convert; +} + +#ifndef STBI_THREAD_LOCAL +#define stbi__unpremultiply_on_load stbi__unpremultiply_on_load_global +#define stbi__de_iphone_flag stbi__de_iphone_flag_global +#else +static STBI_THREAD_LOCAL int stbi__unpremultiply_on_load_local, stbi__unpremultiply_on_load_set; +static STBI_THREAD_LOCAL int stbi__de_iphone_flag_local, stbi__de_iphone_flag_set; + +STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply) +{ + stbi__unpremultiply_on_load_local = flag_true_if_should_unpremultiply; + stbi__unpremultiply_on_load_set = 1; +} + +STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert) +{ + stbi__de_iphone_flag_local = flag_true_if_should_convert; + stbi__de_iphone_flag_set = 1; +} + +#define stbi__unpremultiply_on_load (stbi__unpremultiply_on_load_set \ + ? stbi__unpremultiply_on_load_local \ + : stbi__unpremultiply_on_load_global) +#define stbi__de_iphone_flag (stbi__de_iphone_flag_set \ + ? 
stbi__de_iphone_flag_local \ + : stbi__de_iphone_flag_global) +#endif // STBI_THREAD_LOCAL + +static void stbi__de_iphone(stbi__png *z) +{ + stbi__context *s = z->s; + stbi__uint32 i, pixel_count = s->img_x * s->img_y; + stbi_uc *p = z->out; + + if (s->img_out_n == 3) { // convert bgr to rgb + for (i=0; i < pixel_count; ++i) { + stbi_uc t = p[0]; + p[0] = p[2]; + p[2] = t; + p += 3; + } + } else { + STBI_ASSERT(s->img_out_n == 4); + if (stbi__unpremultiply_on_load) { + // convert bgr to rgb and unpremultiply + for (i=0; i < pixel_count; ++i) { + stbi_uc a = p[3]; + stbi_uc t = p[0]; + if (a) { + stbi_uc half = a / 2; + p[0] = (p[2] * 255 + half) / a; + p[1] = (p[1] * 255 + half) / a; + p[2] = ( t * 255 + half) / a; + } else { + p[0] = p[2]; + p[2] = t; + } + p += 4; + } + } else { + // convert bgr to rgb + for (i=0; i < pixel_count; ++i) { + stbi_uc t = p[0]; + p[0] = p[2]; + p[2] = t; + p += 4; + } + } + } +} + +#define STBI__PNG_TYPE(a,b,c,d) (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d)) + +static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp) +{ + stbi_uc palette[1024], pal_img_n=0; + stbi_uc has_trans=0, tc[3]={0}; + stbi__uint16 tc16[3]; + stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0; + int first=1,k,interlace=0, color=0, is_iphone=0; + stbi__context *s = z->s; + + z->expanded = NULL; + z->idata = NULL; + z->out = NULL; + + if (!stbi__check_png_header(s)) return 0; + + if (scan == STBI__SCAN_type) return 1; + + for (;;) { + stbi__pngchunk c = stbi__get_chunk_header(s); + switch (c.type) { + case STBI__PNG_TYPE('C','g','B','I'): + is_iphone = 1; + stbi__skip(s, c.length); + break; + case STBI__PNG_TYPE('I','H','D','R'): { + int comp,filter; + if (!first) return stbi__err("multiple IHDR","Corrupt PNG"); + first = 0; + if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG"); + s->img_x = stbi__get32be(s); + s->img_y = stbi__get32be(s); + if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)"); + if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)"); + z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only"); + color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG"); + if (color == 3 && z->depth == 16) return stbi__err("bad ctype","Corrupt PNG"); + if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG"); + comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG"); + filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG"); + interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG"); + if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG"); + if (!pal_img_n) { + s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0); + if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode"); + } else { + // if paletted, then pal_n is our final components, and + // img_n is # components to decompress/filter. 
+ s->img_n = 1; + if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG"); + } + // even with SCAN_header, have to scan to see if we have a tRNS + break; + } + + case STBI__PNG_TYPE('P','L','T','E'): { + if (first) return stbi__err("first not IHDR", "Corrupt PNG"); + if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG"); + pal_len = c.length / 3; + if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG"); + for (i=0; i < pal_len; ++i) { + palette[i*4+0] = stbi__get8(s); + palette[i*4+1] = stbi__get8(s); + palette[i*4+2] = stbi__get8(s); + palette[i*4+3] = 255; + } + break; + } + + case STBI__PNG_TYPE('t','R','N','S'): { + if (first) return stbi__err("first not IHDR", "Corrupt PNG"); + if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG"); + if (pal_img_n) { + if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; } + if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG"); + if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG"); + pal_img_n = 4; + for (i=0; i < c.length; ++i) + palette[i*4+3] = stbi__get8(s); + } else { + if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG"); + if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG"); + has_trans = 1; + // non-paletted with tRNS = constant alpha. if header-scanning, we can stop now. + if (scan == STBI__SCAN_header) { ++s->img_n; return 1; } + if (z->depth == 16) { + for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is + } else { + for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger + } + } + break; + } + + case STBI__PNG_TYPE('I','D','A','T'): { + if (first) return stbi__err("first not IHDR", "Corrupt PNG"); + if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG"); + if (scan == STBI__SCAN_header) { + // header scan definitely stops at first IDAT + if (pal_img_n) + s->img_n = pal_img_n; + return 1; + } + if (c.length > (1u << 30)) return stbi__err("IDAT size limit", "IDAT section larger than 2^30 bytes"); + if ((int)(ioff + c.length) < (int)ioff) return 0; + if (ioff + c.length > idata_limit) { + stbi__uint32 idata_limit_old = idata_limit; + stbi_uc *p; + if (idata_limit == 0) idata_limit = c.length > 4096 ? 
c.length : 4096; + while (ioff + c.length > idata_limit) + idata_limit *= 2; + STBI_NOTUSED(idata_limit_old); + p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory"); + z->idata = p; + } + if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG"); + ioff += c.length; + break; + } + + case STBI__PNG_TYPE('I','E','N','D'): { + stbi__uint32 raw_len, bpl; + if (first) return stbi__err("first not IHDR", "Corrupt PNG"); + if (scan != STBI__SCAN_load) return 1; + if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG"); + // initial guess for decoded data size to avoid unnecessary reallocs + bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component + raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */; + z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone); + if (z->expanded == NULL) return 0; // zlib should set error + STBI_FREE(z->idata); z->idata = NULL; + if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans) + s->img_out_n = s->img_n+1; + else + s->img_out_n = s->img_n; + if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0; + if (has_trans) { + if (z->depth == 16) { + if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0; + } else { + if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0; + } + } + if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2) + stbi__de_iphone(z); + if (pal_img_n) { + // pal_img_n == 3 or 4 + s->img_n = pal_img_n; // record the actual colors we had + s->img_out_n = pal_img_n; + if (req_comp >= 3) s->img_out_n = req_comp; + if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n)) + return 0; + } else if (has_trans) { + // non-paletted image with tRNS -> source image has (constant) alpha + ++s->img_n; + } + STBI_FREE(z->expanded); z->expanded = NULL; + // end of PNG chunk, read and skip CRC + stbi__get32be(s); + return 1; + } + + default: + // if critical, fail + if (first) return stbi__err("first not IHDR", "Corrupt PNG"); + if ((c.type & (1 << 29)) == 0) { + #ifndef STBI_NO_FAILURE_STRINGS + // not threadsafe + static char invalid_chunk[] = "XXXX PNG chunk not known"; + invalid_chunk[0] = STBI__BYTECAST(c.type >> 24); + invalid_chunk[1] = STBI__BYTECAST(c.type >> 16); + invalid_chunk[2] = STBI__BYTECAST(c.type >> 8); + invalid_chunk[3] = STBI__BYTECAST(c.type >> 0); + #endif + return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type"); + } + stbi__skip(s, c.length); + break; + } + // end of PNG chunk, read and skip CRC + stbi__get32be(s); + } +} + +static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri) +{ + void *result=NULL; + if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error"); + if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) { + if (p->depth <= 8) + ri->bits_per_channel = 8; + else if (p->depth == 16) + ri->bits_per_channel = 16; + else + return stbi__errpuc("bad bits_per_channel", "PNG not supported: unsupported color depth"); + result = p->out; + p->out = NULL; + if (req_comp && req_comp != p->s->img_out_n) { + if (ri->bits_per_channel == 8) + result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y); + else + result = stbi__convert_format16((stbi__uint16 
*) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y); + p->s->img_out_n = req_comp; + if (result == NULL) return result; + } + *x = p->s->img_x; + *y = p->s->img_y; + if (n) *n = p->s->img_n; + } + STBI_FREE(p->out); p->out = NULL; + STBI_FREE(p->expanded); p->expanded = NULL; + STBI_FREE(p->idata); p->idata = NULL; + + return result; +} + +static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + stbi__png p; + p.s = s; + return stbi__do_png(&p, x,y,comp,req_comp, ri); +} + +static int stbi__png_test(stbi__context *s) +{ + int r; + r = stbi__check_png_header(s); + stbi__rewind(s); + return r; +} + +static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp) +{ + if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) { + stbi__rewind( p->s ); + return 0; + } + if (x) *x = p->s->img_x; + if (y) *y = p->s->img_y; + if (comp) *comp = p->s->img_n; + return 1; +} + +static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp) +{ + stbi__png p; + p.s = s; + return stbi__png_info_raw(&p, x, y, comp); +} + +static int stbi__png_is16(stbi__context *s) +{ + stbi__png p; + p.s = s; + if (!stbi__png_info_raw(&p, NULL, NULL, NULL)) + return 0; + if (p.depth != 16) { + stbi__rewind(p.s); + return 0; + } + return 1; +} +#endif + +// Microsoft/Windows BMP image + +#ifndef STBI_NO_BMP +static int stbi__bmp_test_raw(stbi__context *s) +{ + int r; + int sz; + if (stbi__get8(s) != 'B') return 0; + if (stbi__get8(s) != 'M') return 0; + stbi__get32le(s); // discard filesize + stbi__get16le(s); // discard reserved + stbi__get16le(s); // discard reserved + stbi__get32le(s); // discard data offset + sz = stbi__get32le(s); + r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124); + return r; +} + +static int stbi__bmp_test(stbi__context *s) +{ + int r = stbi__bmp_test_raw(s); + stbi__rewind(s); + return r; +} + + +// returns 0..31 for the highest set bit +static int stbi__high_bit(unsigned int z) +{ + int n=0; + if (z == 0) return -1; + if (z >= 0x10000) { n += 16; z >>= 16; } + if (z >= 0x00100) { n += 8; z >>= 8; } + if (z >= 0x00010) { n += 4; z >>= 4; } + if (z >= 0x00004) { n += 2; z >>= 2; } + if (z >= 0x00002) { n += 1;/* >>= 1;*/ } + return n; +} + +static int stbi__bitcount(unsigned int a) +{ + a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2 + a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4 + a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits + a = (a + (a >> 8)); // max 16 per 8 bits + a = (a + (a >> 16)); // max 32 per 8 bits + return a & 0xff; +} + +// extract an arbitrarily-aligned N-bit value (N=bits) +// from v, and then make it 8-bits long and fractionally +// extend it to full full range. 
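+// Worked example (illustrative, not part of the upstream comment): for the
+// red channel of a 16-bit BMP the mask is 0x7C00 (31u << 10, see
+// stbi__bmp_set_mask_defaults below), so stbi__high_bit() returns 14,
+// shift = 14-7 = 7 and bits = 5. A maximal field 0x7C00 >> 7 = 0xF8, then
+// v >>= (8-5) leaves 31; 31 * mul_table[5] (0x21) = 1023, and 1023 shifted
+// right by shift_table[5] (2) gives 255 -- i.e. the value is fractionally
+// extended to the full 8-bit range.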
+static int stbi__shiftsigned(unsigned int v, int shift, int bits) +{ + static unsigned int mul_table[9] = { + 0, + 0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/, + 0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/, + }; + static unsigned int shift_table[9] = { + 0, 0,0,1,0,2,4,6,0, + }; + if (shift < 0) + v <<= -shift; + else + v >>= shift; + STBI_ASSERT(v < 256); + v >>= (8-bits); + STBI_ASSERT(bits >= 0 && bits <= 8); + return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits]; +} + +typedef struct +{ + int bpp, offset, hsz; + unsigned int mr,mg,mb,ma, all_a; + int extra_read; +} stbi__bmp_data; + +static int stbi__bmp_set_mask_defaults(stbi__bmp_data *info, int compress) +{ + // BI_BITFIELDS specifies masks explicitly, don't override + if (compress == 3) + return 1; + + if (compress == 0) { + if (info->bpp == 16) { + info->mr = 31u << 10; + info->mg = 31u << 5; + info->mb = 31u << 0; + } else if (info->bpp == 32) { + info->mr = 0xffu << 16; + info->mg = 0xffu << 8; + info->mb = 0xffu << 0; + info->ma = 0xffu << 24; + info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0 + } else { + // otherwise, use defaults, which is all-0 + info->mr = info->mg = info->mb = info->ma = 0; + } + return 1; + } + return 0; // error +} + +static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info) +{ + int hsz; + if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP"); + stbi__get32le(s); // discard filesize + stbi__get16le(s); // discard reserved + stbi__get16le(s); // discard reserved + info->offset = stbi__get32le(s); + info->hsz = hsz = stbi__get32le(s); + info->mr = info->mg = info->mb = info->ma = 0; + info->extra_read = 14; + + if (info->offset < 0) return stbi__errpuc("bad BMP", "bad BMP"); + + if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown"); + if (hsz == 12) { + s->img_x = stbi__get16le(s); + s->img_y = stbi__get16le(s); + } else { + s->img_x = stbi__get32le(s); + s->img_y = stbi__get32le(s); + } + if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP"); + info->bpp = stbi__get16le(s); + if (hsz != 12) { + int compress = stbi__get32le(s); + if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE"); + if (compress >= 4) return stbi__errpuc("BMP JPEG/PNG", "BMP type not supported: unsupported compression"); // this includes PNG/JPEG modes + if (compress == 3 && info->bpp != 16 && info->bpp != 32) return stbi__errpuc("bad BMP", "bad BMP"); // bitfields requires 16 or 32 bits/pixel + stbi__get32le(s); // discard sizeof + stbi__get32le(s); // discard hres + stbi__get32le(s); // discard vres + stbi__get32le(s); // discard colorsused + stbi__get32le(s); // discard max important + if (hsz == 40 || hsz == 56) { + if (hsz == 56) { + stbi__get32le(s); + stbi__get32le(s); + stbi__get32le(s); + stbi__get32le(s); + } + if (info->bpp == 16 || info->bpp == 32) { + if (compress == 0) { + stbi__bmp_set_mask_defaults(info, compress); + } else if (compress == 3) { + info->mr = stbi__get32le(s); + info->mg = stbi__get32le(s); + info->mb = stbi__get32le(s); + info->extra_read += 12; + // not documented, but generated by photoshop and handled by mspaint + if (info->mr == info->mg && info->mg == info->mb) { + // ?!?!? 
+ return stbi__errpuc("bad BMP", "bad BMP"); + } + } else + return stbi__errpuc("bad BMP", "bad BMP"); + } + } else { + // V4/V5 header + int i; + if (hsz != 108 && hsz != 124) + return stbi__errpuc("bad BMP", "bad BMP"); + info->mr = stbi__get32le(s); + info->mg = stbi__get32le(s); + info->mb = stbi__get32le(s); + info->ma = stbi__get32le(s); + if (compress != 3) // override mr/mg/mb unless in BI_BITFIELDS mode, as per docs + stbi__bmp_set_mask_defaults(info, compress); + stbi__get32le(s); // discard color space + for (i=0; i < 12; ++i) + stbi__get32le(s); // discard color space parameters + if (hsz == 124) { + stbi__get32le(s); // discard rendering intent + stbi__get32le(s); // discard offset of profile data + stbi__get32le(s); // discard size of profile data + stbi__get32le(s); // discard reserved + } + } + } + return (void *) 1; +} + + +static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + stbi_uc *out; + unsigned int mr=0,mg=0,mb=0,ma=0, all_a; + stbi_uc pal[256][4]; + int psize=0,i,j,width; + int flip_vertically, pad, target; + stbi__bmp_data info; + STBI_NOTUSED(ri); + + info.all_a = 255; + if (stbi__bmp_parse_header(s, &info) == NULL) + return NULL; // error code already set + + flip_vertically = ((int) s->img_y) > 0; + s->img_y = abs((int) s->img_y); + + if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + + mr = info.mr; + mg = info.mg; + mb = info.mb; + ma = info.ma; + all_a = info.all_a; + + if (info.hsz == 12) { + if (info.bpp < 24) + psize = (info.offset - info.extra_read - 24) / 3; + } else { + if (info.bpp < 16) + psize = (info.offset - info.extra_read - info.hsz) >> 2; + } + if (psize == 0) { + // accept some number of extra bytes after the header, but if the offset points either to before + // the header ends or implies a large amount of extra data, reject the file as malformed + int bytes_read_so_far = s->callback_already_read + (int)(s->img_buffer - s->img_buffer_original); + int header_limit = 1024; // max we actually read is below 256 bytes currently. + int extra_data_limit = 256*4; // what ordinarily goes here is a palette; 256 entries*4 bytes is its max size. + if (bytes_read_so_far <= 0 || bytes_read_so_far > header_limit) { + return stbi__errpuc("bad header", "Corrupt BMP"); + } + // we established that bytes_read_so_far is positive and sensible. + // the first half of this test rejects offsets that are either too small positives, or + // negative, and guarantees that info.offset >= bytes_read_so_far > 0. this in turn + // ensures the number computed in the second half of the test can't overflow. + if (info.offset < bytes_read_so_far || info.offset - bytes_read_so_far > extra_data_limit) { + return stbi__errpuc("bad offset", "Corrupt BMP"); + } else { + stbi__skip(s, info.offset - bytes_read_so_far); + } + } + + if (info.bpp == 24 && ma == 0xff000000) + s->img_n = 3; + else + s->img_n = ma ? 
4 : 3; + if (req_comp && req_comp >= 3) // we can directly decode 3 or 4 + target = req_comp; + else + target = s->img_n; // if they want monochrome, we'll post-convert + + // sanity-check size + if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0)) + return stbi__errpuc("too large", "Corrupt BMP"); + + out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0); + if (!out) return stbi__errpuc("outofmem", "Out of memory"); + if (info.bpp < 16) { + int z=0; + if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); } + for (i=0; i < psize; ++i) { + pal[i][2] = stbi__get8(s); + pal[i][1] = stbi__get8(s); + pal[i][0] = stbi__get8(s); + if (info.hsz != 12) stbi__get8(s); + pal[i][3] = 255; + } + stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4)); + if (info.bpp == 1) width = (s->img_x + 7) >> 3; + else if (info.bpp == 4) width = (s->img_x + 1) >> 1; + else if (info.bpp == 8) width = s->img_x; + else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); } + pad = (-width)&3; + if (info.bpp == 1) { + for (j=0; j < (int) s->img_y; ++j) { + int bit_offset = 7, v = stbi__get8(s); + for (i=0; i < (int) s->img_x; ++i) { + int color = (v>>bit_offset)&0x1; + out[z++] = pal[color][0]; + out[z++] = pal[color][1]; + out[z++] = pal[color][2]; + if (target == 4) out[z++] = 255; + if (i+1 == (int) s->img_x) break; + if((--bit_offset) < 0) { + bit_offset = 7; + v = stbi__get8(s); + } + } + stbi__skip(s, pad); + } + } else { + for (j=0; j < (int) s->img_y; ++j) { + for (i=0; i < (int) s->img_x; i += 2) { + int v=stbi__get8(s),v2=0; + if (info.bpp == 4) { + v2 = v & 15; + v >>= 4; + } + out[z++] = pal[v][0]; + out[z++] = pal[v][1]; + out[z++] = pal[v][2]; + if (target == 4) out[z++] = 255; + if (i+1 == (int) s->img_x) break; + v = (info.bpp == 8) ? stbi__get8(s) : v2; + out[z++] = pal[v][0]; + out[z++] = pal[v][1]; + out[z++] = pal[v][2]; + if (target == 4) out[z++] = 255; + } + stbi__skip(s, pad); + } + } + } else { + int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0; + int z = 0; + int easy=0; + stbi__skip(s, info.offset - info.extra_read - info.hsz); + if (info.bpp == 24) width = 3 * s->img_x; + else if (info.bpp == 16) width = 2*s->img_x; + else /* bpp = 32 and pad = 0 */ width=0; + pad = (-width) & 3; + if (info.bpp == 24) { + easy = 1; + } else if (info.bpp == 32) { + if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000) + easy = 2; + } + if (!easy) { + if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); } + // right shift amt to put high bit in position #7 + rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr); + gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg); + bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb); + ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma); + if (rcount > 8 || gcount > 8 || bcount > 8 || acount > 8) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); } + } + for (j=0; j < (int) s->img_y; ++j) { + if (easy) { + for (i=0; i < (int) s->img_x; ++i) { + unsigned char a; + out[z+2] = stbi__get8(s); + out[z+1] = stbi__get8(s); + out[z+0] = stbi__get8(s); + z += 3; + a = (easy == 2 ? stbi__get8(s) : 255); + all_a |= a; + if (target == 4) out[z++] = a; + } + } else { + int bpp = info.bpp; + for (i=0; i < (int) s->img_x; ++i) { + stbi__uint32 v = (bpp == 16 ? 
(stbi__uint32) stbi__get16le(s) : stbi__get32le(s)); + unsigned int a; + out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount)); + out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount)); + out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount)); + a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255); + all_a |= a; + if (target == 4) out[z++] = STBI__BYTECAST(a); + } + } + stbi__skip(s, pad); + } + } + + // if alpha channel is all 0s, replace with all 255s + if (target == 4 && all_a == 0) + for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4) + out[i] = 255; + + if (flip_vertically) { + stbi_uc t; + for (j=0; j < (int) s->img_y>>1; ++j) { + stbi_uc *p1 = out + j *s->img_x*target; + stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target; + for (i=0; i < (int) s->img_x*target; ++i) { + t = p1[i]; p1[i] = p2[i]; p2[i] = t; + } + } + } + + if (req_comp && req_comp != target) { + out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y); + if (out == NULL) return out; // stbi__convert_format frees input on failure + } + + *x = s->img_x; + *y = s->img_y; + if (comp) *comp = s->img_n; + return out; +} +#endif + +// Targa Truevision - TGA +// by Jonathan Dummer +#ifndef STBI_NO_TGA +// returns STBI_rgb or whatever, 0 on error +static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16) +{ + // only RGB or RGBA (incl. 16bit) or grey allowed + if (is_rgb16) *is_rgb16 = 0; + switch(bits_per_pixel) { + case 8: return STBI_grey; + case 16: if(is_grey) return STBI_grey_alpha; + // fallthrough + case 15: if(is_rgb16) *is_rgb16 = 1; + return STBI_rgb; + case 24: // fallthrough + case 32: return bits_per_pixel/8; + default: return 0; + } +} + +static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp) +{ + int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp; + int sz, tga_colormap_type; + stbi__get8(s); // discard Offset + tga_colormap_type = stbi__get8(s); // colormap type + if( tga_colormap_type > 1 ) { + stbi__rewind(s); + return 0; // only RGB or indexed allowed + } + tga_image_type = stbi__get8(s); // image type + if ( tga_colormap_type == 1 ) { // colormapped (paletted) image + if (tga_image_type != 1 && tga_image_type != 9) { + stbi__rewind(s); + return 0; + } + stbi__skip(s,4); // skip index of first colormap entry and number of entries + sz = stbi__get8(s); // check bits per palette color entry + if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) { + stbi__rewind(s); + return 0; + } + stbi__skip(s,4); // skip image x and y origin + tga_colormap_bpp = sz; + } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE + if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) { + stbi__rewind(s); + return 0; // only RGB or grey allowed, +/- RLE + } + stbi__skip(s,9); // skip colormap specification and image x/y origin + tga_colormap_bpp = 0; + } + tga_w = stbi__get16le(s); + if( tga_w < 1 ) { + stbi__rewind(s); + return 0; // test width + } + tga_h = stbi__get16le(s); + if( tga_h < 1 ) { + stbi__rewind(s); + return 0; // test height + } + tga_bits_per_pixel = stbi__get8(s); // bits per pixel + stbi__get8(s); // ignore alpha bits + if (tga_colormap_bpp != 0) { + if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) { + // when using a colormap, tga_bits_per_pixel is the size of the indexes + // I don't think anything but 8 or 16bit indexes makes sense + stbi__rewind(s); + return 0; + } + tga_comp = 
stbi__tga_get_comp(tga_colormap_bpp, 0, NULL); + } else { + tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL); + } + if(!tga_comp) { + stbi__rewind(s); + return 0; + } + if (x) *x = tga_w; + if (y) *y = tga_h; + if (comp) *comp = tga_comp; + return 1; // seems to have passed everything +} + +static int stbi__tga_test(stbi__context *s) +{ + int res = 0; + int sz, tga_color_type; + stbi__get8(s); // discard Offset + tga_color_type = stbi__get8(s); // color type + if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed + sz = stbi__get8(s); // image type + if ( tga_color_type == 1 ) { // colormapped (paletted) image + if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9 + stbi__skip(s,4); // skip index of first colormap entry and number of entries + sz = stbi__get8(s); // check bits per palette color entry + if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd; + stbi__skip(s,4); // skip image x and y origin + } else { // "normal" image w/o colormap + if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE + stbi__skip(s,9); // skip colormap specification and image x/y origin + } + if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width + if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height + sz = stbi__get8(s); // bits per pixel + if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index + if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd; + + res = 1; // if we got this far, everything's good and we can return 1 instead of 0 + +errorEnd: + stbi__rewind(s); + return res; +} + +// read 16bit value and convert to 24bit RGB +static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out) +{ + stbi__uint16 px = (stbi__uint16)stbi__get16le(s); + stbi__uint16 fiveBitMask = 31; + // we have 3 channels with 5bits each + int r = (px >> 10) & fiveBitMask; + int g = (px >> 5) & fiveBitMask; + int b = px & fiveBitMask; + // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later + out[0] = (stbi_uc)((r * 255)/31); + out[1] = (stbi_uc)((g * 255)/31); + out[2] = (stbi_uc)((b * 255)/31); + + // some people claim that the most significant bit might be used for alpha + // (possibly if an alpha-bit is set in the "image descriptor byte") + // but that only made 16bit test images completely translucent.. + // so let's treat all 15 and 16bit TGAs as RGB with no alpha. +} + +static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + // read in the TGA header stuff + int tga_offset = stbi__get8(s); + int tga_indexed = stbi__get8(s); + int tga_image_type = stbi__get8(s); + int tga_is_RLE = 0; + int tga_palette_start = stbi__get16le(s); + int tga_palette_len = stbi__get16le(s); + int tga_palette_bits = stbi__get8(s); + int tga_x_origin = stbi__get16le(s); + int tga_y_origin = stbi__get16le(s); + int tga_width = stbi__get16le(s); + int tga_height = stbi__get16le(s); + int tga_bits_per_pixel = stbi__get8(s); + int tga_comp, tga_rgb16=0; + int tga_inverted = stbi__get8(s); + // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?) 
+ // image data + unsigned char *tga_data; + unsigned char *tga_palette = NULL; + int i, j; + unsigned char raw_data[4] = {0}; + int RLE_count = 0; + int RLE_repeating = 0; + int read_next_pixel = 1; + STBI_NOTUSED(ri); + STBI_NOTUSED(tga_x_origin); // @TODO + STBI_NOTUSED(tga_y_origin); // @TODO + + if (tga_height > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + if (tga_width > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + + // do a tiny bit of precessing + if ( tga_image_type >= 8 ) + { + tga_image_type -= 8; + tga_is_RLE = 1; + } + tga_inverted = 1 - ((tga_inverted >> 5) & 1); + + // If I'm paletted, then I'll use the number of bits from the palette + if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16); + else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16); + + if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency + return stbi__errpuc("bad format", "Can't find out TGA pixelformat"); + + // tga info + *x = tga_width; + *y = tga_height; + if (comp) *comp = tga_comp; + + if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0)) + return stbi__errpuc("too large", "Corrupt TGA"); + + tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0); + if (!tga_data) return stbi__errpuc("outofmem", "Out of memory"); + + // skip to the data's starting position (offset usually = 0) + stbi__skip(s, tga_offset ); + + if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) { + for (i=0; i < tga_height; ++i) { + int row = tga_inverted ? tga_height -i - 1 : i; + stbi_uc *tga_row = tga_data + row*tga_width*tga_comp; + stbi__getn(s, tga_row, tga_width * tga_comp); + } + } else { + // do I need to load a palette? + if ( tga_indexed) + { + if (tga_palette_len == 0) { /* you have to have at least one entry! */ + STBI_FREE(tga_data); + return stbi__errpuc("bad palette", "Corrupt TGA"); + } + + // any data to skip? (offset usually = 0) + stbi__skip(s, tga_palette_start ); + // load the palette + tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0); + if (!tga_palette) { + STBI_FREE(tga_data); + return stbi__errpuc("outofmem", "Out of memory"); + } + if (tga_rgb16) { + stbi_uc *pal_entry = tga_palette; + STBI_ASSERT(tga_comp == STBI_rgb); + for (i=0; i < tga_palette_len; ++i) { + stbi__tga_read_rgb16(s, pal_entry); + pal_entry += tga_comp; + } + } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) { + STBI_FREE(tga_data); + STBI_FREE(tga_palette); + return stbi__errpuc("bad palette", "Corrupt TGA"); + } + } + // load the data + for (i=0; i < tga_width * tga_height; ++i) + { + // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk? + if ( tga_is_RLE ) + { + if ( RLE_count == 0 ) + { + // yep, get the next byte as a RLE command + int RLE_cmd = stbi__get8(s); + RLE_count = 1 + (RLE_cmd & 127); + RLE_repeating = RLE_cmd >> 7; + read_next_pixel = 1; + } else if ( !RLE_repeating ) + { + read_next_pixel = 1; + } + } else + { + read_next_pixel = 1; + } + // OK, if I need to read a pixel, do it now + if ( read_next_pixel ) + { + // load however much data we did have + if ( tga_indexed ) + { + // read in index, then perform the lookup + int pal_idx = (tga_bits_per_pixel == 8) ? 
stbi__get8(s) : stbi__get16le(s); + if ( pal_idx >= tga_palette_len ) { + // invalid index + pal_idx = 0; + } + pal_idx *= tga_comp; + for (j = 0; j < tga_comp; ++j) { + raw_data[j] = tga_palette[pal_idx+j]; + } + } else if(tga_rgb16) { + STBI_ASSERT(tga_comp == STBI_rgb); + stbi__tga_read_rgb16(s, raw_data); + } else { + // read in the data raw + for (j = 0; j < tga_comp; ++j) { + raw_data[j] = stbi__get8(s); + } + } + // clear the reading flag for the next pixel + read_next_pixel = 0; + } // end of reading a pixel + + // copy data + for (j = 0; j < tga_comp; ++j) + tga_data[i*tga_comp+j] = raw_data[j]; + + // in case we're in RLE mode, keep counting down + --RLE_count; + } + // do I need to invert the image? + if ( tga_inverted ) + { + for (j = 0; j*2 < tga_height; ++j) + { + int index1 = j * tga_width * tga_comp; + int index2 = (tga_height - 1 - j) * tga_width * tga_comp; + for (i = tga_width * tga_comp; i > 0; --i) + { + unsigned char temp = tga_data[index1]; + tga_data[index1] = tga_data[index2]; + tga_data[index2] = temp; + ++index1; + ++index2; + } + } + } + // clear my palette, if I had one + if ( tga_palette != NULL ) + { + STBI_FREE( tga_palette ); + } + } + + // swap RGB - if the source data was RGB16, it already is in the right order + if (tga_comp >= 3 && !tga_rgb16) + { + unsigned char* tga_pixel = tga_data; + for (i=0; i < tga_width * tga_height; ++i) + { + unsigned char temp = tga_pixel[0]; + tga_pixel[0] = tga_pixel[2]; + tga_pixel[2] = temp; + tga_pixel += tga_comp; + } + } + + // convert to target component count + if (req_comp && req_comp != tga_comp) + tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height); + + // the things I do to get rid of an error message, and yet keep + // Microsoft's C compilers happy... [8^( + tga_palette_start = tga_palette_len = tga_palette_bits = + tga_x_origin = tga_y_origin = 0; + STBI_NOTUSED(tga_palette_start); + // OK, done + return tga_data; +} +#endif + +// ************************************************************************************************* +// Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB + +#ifndef STBI_NO_PSD +static int stbi__psd_test(stbi__context *s) +{ + int r = (stbi__get32be(s) == 0x38425053); + stbi__rewind(s); + return r; +} + +static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount) +{ + int count, nleft, len; + + count = 0; + while ((nleft = pixelCount - count) > 0) { + len = stbi__get8(s); + if (len == 128) { + // No-op. + } else if (len < 128) { + // Copy next len+1 bytes literally. + len++; + if (len > nleft) return 0; // corrupt data + count += len; + while (len) { + *p = stbi__get8(s); + p += 4; + len--; + } + } else if (len > 128) { + stbi_uc val; + // Next -len+1 bytes in the dest are replicated from next source byte. + // (Interpret len as a negative 8-bit int.) + len = 257 - len; + if (len > nleft) return 0; // corrupt data + val = stbi__get8(s); + count += len; + while (len) { + *p = val; + p += 4; + len--; + } + } + } + + return 1; +} + +static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc) +{ + int pixelCount; + int channelCount, compression; + int channel, i; + int bitdepth; + int w,h; + stbi_uc *out; + STBI_NOTUSED(ri); + + // Check identifier + if (stbi__get32be(s) != 0x38425053) // "8BPS" + return stbi__errpuc("not PSD", "Corrupt PSD image"); + + // Check file type version. 
+ if (stbi__get16be(s) != 1) + return stbi__errpuc("wrong version", "Unsupported version of PSD image"); + + // Skip 6 reserved bytes. + stbi__skip(s, 6 ); + + // Read the number of channels (R, G, B, A, etc). + channelCount = stbi__get16be(s); + if (channelCount < 0 || channelCount > 16) + return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image"); + + // Read the rows and columns of the image. + h = stbi__get32be(s); + w = stbi__get32be(s); + + if (h > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + if (w > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + + // Make sure the depth is 8 bits. + bitdepth = stbi__get16be(s); + if (bitdepth != 8 && bitdepth != 16) + return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit"); + + // Make sure the color mode is RGB. + // Valid options are: + // 0: Bitmap + // 1: Grayscale + // 2: Indexed color + // 3: RGB color + // 4: CMYK color + // 7: Multichannel + // 8: Duotone + // 9: Lab color + if (stbi__get16be(s) != 3) + return stbi__errpuc("wrong color format", "PSD is not in RGB color format"); + + // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.) + stbi__skip(s,stbi__get32be(s) ); + + // Skip the image resources. (resolution, pen tool paths, etc) + stbi__skip(s, stbi__get32be(s) ); + + // Skip the reserved data. + stbi__skip(s, stbi__get32be(s) ); + + // Find out if the data is compressed. + // Known values: + // 0: no compression + // 1: RLE compressed + compression = stbi__get16be(s); + if (compression > 1) + return stbi__errpuc("bad compression", "PSD has an unknown compression format"); + + // Check size + if (!stbi__mad3sizes_valid(4, w, h, 0)) + return stbi__errpuc("too large", "Corrupt PSD"); + + // Create the destination image. + + if (!compression && bitdepth == 16 && bpc == 16) { + out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0); + ri->bits_per_channel = 16; + } else + out = (stbi_uc *) stbi__malloc(4 * w*h); + + if (!out) return stbi__errpuc("outofmem", "Out of memory"); + pixelCount = w*h; + + // Initialize the data to zero. + //memset( out, 0, pixelCount * 4 ); + + // Finally, the image data. + if (compression) { + // RLE as used by .PSD and .TIFF + // Loop until you get the number of unpacked bytes you are expecting: + // Read the next source byte into n. + // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally. + // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times. + // Else if n is 128, noop. + // Endloop + + // The RLE-compressed data is preceded by a 2-byte data count for each row in the data, + // which we're going to just skip. + stbi__skip(s, h * channelCount * 2 ); + + // Read the RLE data by channel. + for (channel = 0; channel < 4; channel++) { + stbi_uc *p; + + p = out+channel; + if (channel >= channelCount) { + // Fill this channel with default data. + for (i = 0; i < pixelCount; i++, p += 4) + *p = (channel == 3 ? 255 : 0); + } else { + // Read the RLE data. + if (!stbi__psd_decode_rle(s, p, pixelCount)) { + STBI_FREE(out); + return stbi__errpuc("corrupt", "bad RLE data"); + } + } + } + + } else { + // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...) + // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image. + + // Read the data by channel. 
+ for (channel = 0; channel < 4; channel++) { + if (channel >= channelCount) { + // Fill this channel with default data. + if (bitdepth == 16 && bpc == 16) { + stbi__uint16 *q = ((stbi__uint16 *) out) + channel; + stbi__uint16 val = channel == 3 ? 65535 : 0; + for (i = 0; i < pixelCount; i++, q += 4) + *q = val; + } else { + stbi_uc *p = out+channel; + stbi_uc val = channel == 3 ? 255 : 0; + for (i = 0; i < pixelCount; i++, p += 4) + *p = val; + } + } else { + if (ri->bits_per_channel == 16) { // output bpc + stbi__uint16 *q = ((stbi__uint16 *) out) + channel; + for (i = 0; i < pixelCount; i++, q += 4) + *q = (stbi__uint16) stbi__get16be(s); + } else { + stbi_uc *p = out+channel; + if (bitdepth == 16) { // input bpc + for (i = 0; i < pixelCount; i++, p += 4) + *p = (stbi_uc) (stbi__get16be(s) >> 8); + } else { + for (i = 0; i < pixelCount; i++, p += 4) + *p = stbi__get8(s); + } + } + } + } + } + + // remove weird white matte from PSD + if (channelCount >= 4) { + if (ri->bits_per_channel == 16) { + for (i=0; i < w*h; ++i) { + stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i; + if (pixel[3] != 0 && pixel[3] != 65535) { + float a = pixel[3] / 65535.0f; + float ra = 1.0f / a; + float inv_a = 65535.0f * (1 - ra); + pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a); + pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a); + pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a); + } + } + } else { + for (i=0; i < w*h; ++i) { + unsigned char *pixel = out + 4*i; + if (pixel[3] != 0 && pixel[3] != 255) { + float a = pixel[3] / 255.0f; + float ra = 1.0f / a; + float inv_a = 255.0f * (1 - ra); + pixel[0] = (unsigned char) (pixel[0]*ra + inv_a); + pixel[1] = (unsigned char) (pixel[1]*ra + inv_a); + pixel[2] = (unsigned char) (pixel[2]*ra + inv_a); + } + } + } + } + + // convert to desired output format + if (req_comp && req_comp != 4) { + if (ri->bits_per_channel == 16) + out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h); + else + out = stbi__convert_format(out, 4, req_comp, w, h); + if (out == NULL) return out; // stbi__convert_format frees input on failure + } + + if (comp) *comp = 4; + *y = h; + *x = w; + + return out; +} +#endif + +// ************************************************************************************************* +// Softimage PIC loader +// by Tom Seddon +// +// See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format +// See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/ + +#ifndef STBI_NO_PIC +static int stbi__pic_is4(stbi__context *s,const char *str) +{ + int i; + for (i=0; i<4; ++i) + if (stbi__get8(s) != (stbi_uc)str[i]) + return 0; + + return 1; +} + +static int stbi__pic_test_core(stbi__context *s) +{ + int i; + + if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) + return 0; + + for(i=0;i<84;++i) + stbi__get8(s); + + if (!stbi__pic_is4(s,"PICT")) + return 0; + + return 1; +} + +typedef struct +{ + stbi_uc size,type,channel; +} stbi__pic_packet; + +static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest) +{ + int mask=0x80, i; + + for (i=0; i<4; ++i, mask>>=1) { + if (channel & mask) { + if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short"); + dest[i]=stbi__get8(s); + } + } + + return dest; +} + +static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src) +{ + int mask=0x80,i; + + for (i=0;i<4; ++i, mask>>=1) + if (channel&mask) + dest[i]=src[i]; +} + +static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result) +{ + int 
act_comp=0,num_packets=0,y,chained;
+   stbi__pic_packet packets[10];
+
+   // this will (should...) cater for even some bizarre stuff like having data
+   //  for the same channel in multiple packets.
+   do {
+      stbi__pic_packet *packet;
+
+      if (num_packets==sizeof(packets)/sizeof(packets[0]))
+         return stbi__errpuc("bad format","too many packets");
+
+      packet = &packets[num_packets++];
+
+      chained = stbi__get8(s);
+      packet->size    = stbi__get8(s);
+      packet->type    = stbi__get8(s);
+      packet->channel = stbi__get8(s);
+
+      act_comp |= packet->channel;
+
+      if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
+      if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
+   } while (chained);
+
+   *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
+
+   for(y=0; y<height; ++y) {
+      int packet_idx;
+
+      for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
+         stbi__pic_packet *packet = &packets[packet_idx];
+         stbi_uc *dest = result+y*width*4;
+
+         switch (packet->type) {
+            default:
+               return stbi__errpuc("bad format","packet has bad compression type");
+
+            case 0: {//uncompressed
+               int x;
+
+               for(x=0;x<width;++x, dest+=4)
+                  if (!stbi__readval(s,packet->channel,dest))
+                     return 0;
+               break;
+            }
+
+            case 1://Pure RLE
+               {
+                  int left=width, i;
+
+                  while (left>0) {
+                     stbi_uc count,value[4];
+
+                     count=stbi__get8(s);
+                     if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
+
+                     if (count > left)
+                        count = (stbi_uc) left;
+
+                     if (!stbi__readval(s,packet->channel,value))  return 0;
+
+                     for(i=0; i<count; ++i,dest+=4)
+                        stbi__copyval(packet->channel,dest,value);
+                     left -= count;
+                  }
+               }
+               break;
+
+            case 2: {//Mixed RLE
+               int left=width;
+               while (left>0) {
+                  int count = stbi__get8(s), i;
+                  if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
+
+                  if (count >= 128) { // Repeated
+                     stbi_uc value[4];
+
+                     if (count==128)
+                        count = stbi__get16be(s);
+                     else
+                        count -= 127;
+                     if (count > left)
+                        return stbi__errpuc("bad file","scanline overrun");
+
+                     if (!stbi__readval(s,packet->channel,value))
+                        return 0;
+
+                     for(i=0;i<count;++i, dest += 4)
+                        stbi__copyval(packet->channel,dest,value);
+                  } else { // Raw
+                     ++count;
+                     if (count>left) return stbi__errpuc("bad file","scanline overrun");
+
+                     for(i=0;i<count;++i, dest+=4)
+                        if (!stbi__readval(s,packet->channel,dest))
+                           return 0;
+                  }
+                  left-=count;
+               }
+               break;
+            }
+         }
+      }
+   }
+
+   return result;
+}
+
+static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
+{
+   stbi_uc *result;
+   int i, x,y, internal_comp;
+   STBI_NOTUSED(ri);
+
+   if (!comp) comp = &internal_comp;
+
+   for (i=0; i<92; ++i)
+      stbi__get8(s);
+
+   x = stbi__get16be(s);
+   y = stbi__get16be(s);
+
+   if (y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
+   if (x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
+
+   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
+   if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
+
+   stbi__get32be(s); //skip `ratio'
+   stbi__get16be(s); //skip `fields'
+   stbi__get16be(s); //skip `pad'
+
+   // intermediate buffer is RGBA
+   result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
+   if (!result) return stbi__errpuc("outofmem", "Out of memory");
+   memset(result, 0xff, x*y*4);
+
+   if (!stbi__pic_load_core(s,x,y,comp, result)) {
+      STBI_FREE(result);
+      result=0;
+   }
+   *px = x;
+   *py = y;
+   if (req_comp == 0) req_comp = *comp;
+   result=stbi__convert_format(result,4,req_comp,x,y);
+
+   return result;
+}
+
+static int stbi__pic_test(stbi__context *s)
+{
+   int r = stbi__pic_test_core(s);
+   stbi__rewind(s);
+   return r;
+}
+#endif
+
+// *************************************************************************************************
+// GIF loader -- public domain by
Jean-Marc Lienher -- simplified/shrunk by stb + +#ifndef STBI_NO_GIF +typedef struct +{ + stbi__int16 prefix; + stbi_uc first; + stbi_uc suffix; +} stbi__gif_lzw; + +typedef struct +{ + int w,h; + stbi_uc *out; // output buffer (always 4 components) + stbi_uc *background; // The current "background" as far as a gif is concerned + stbi_uc *history; + int flags, bgindex, ratio, transparent, eflags; + stbi_uc pal[256][4]; + stbi_uc lpal[256][4]; + stbi__gif_lzw codes[8192]; + stbi_uc *color_table; + int parse, step; + int lflags; + int start_x, start_y; + int max_x, max_y; + int cur_x, cur_y; + int line_size; + int delay; +} stbi__gif; + +static int stbi__gif_test_raw(stbi__context *s) +{ + int sz; + if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0; + sz = stbi__get8(s); + if (sz != '9' && sz != '7') return 0; + if (stbi__get8(s) != 'a') return 0; + return 1; +} + +static int stbi__gif_test(stbi__context *s) +{ + int r = stbi__gif_test_raw(s); + stbi__rewind(s); + return r; +} + +static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp) +{ + int i; + for (i=0; i < num_entries; ++i) { + pal[i][2] = stbi__get8(s); + pal[i][1] = stbi__get8(s); + pal[i][0] = stbi__get8(s); + pal[i][3] = transp == i ? 0 : 255; + } +} + +static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info) +{ + stbi_uc version; + if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') + return stbi__err("not GIF", "Corrupt GIF"); + + version = stbi__get8(s); + if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF"); + if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF"); + + stbi__g_failure_reason = ""; + g->w = stbi__get16le(s); + g->h = stbi__get16le(s); + g->flags = stbi__get8(s); + g->bgindex = stbi__get8(s); + g->ratio = stbi__get8(s); + g->transparent = -1; + + if (g->w > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)"); + if (g->h > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)"); + + if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments + + if (is_info) return 1; + + if (g->flags & 0x80) + stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1); + + return 1; +} + +static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp) +{ + stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif)); + if (!g) return stbi__err("outofmem", "Out of memory"); + if (!stbi__gif_header(s, g, comp, 1)) { + STBI_FREE(g); + stbi__rewind( s ); + return 0; + } + if (x) *x = g->w; + if (y) *y = g->h; + STBI_FREE(g); + return 1; +} + +static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code) +{ + stbi_uc *p, *c; + int idx; + + // recurse to decode the prefixes, since the linked-list is backwards, + // and working backwards through an interleaved image would be nasty + if (g->codes[code].prefix >= 0) + stbi__out_gif_code(g, g->codes[code].prefix); + + if (g->cur_y >= g->max_y) return; + + idx = g->cur_x + g->cur_y; + p = &g->out[idx]; + g->history[idx / 4] = 1; + + c = &g->color_table[g->codes[code].suffix * 4]; + if (c[3] > 128) { // don't render transparent pixels; + p[0] = c[2]; + p[1] = c[1]; + p[2] = c[0]; + p[3] = c[3]; + } + g->cur_x += 4; + + if (g->cur_x >= g->max_x) { + g->cur_x = g->start_x; + g->cur_y += g->step; + + while (g->cur_y >= g->max_y && g->parse > 0) { + g->step = (1 << 
g->parse) * g->line_size; + g->cur_y = g->start_y + (g->step >> 1); + --g->parse; + } + } +} + +static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g) +{ + stbi_uc lzw_cs; + stbi__int32 len, init_code; + stbi__uint32 first; + stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear; + stbi__gif_lzw *p; + + lzw_cs = stbi__get8(s); + if (lzw_cs > 12) return NULL; + clear = 1 << lzw_cs; + first = 1; + codesize = lzw_cs + 1; + codemask = (1 << codesize) - 1; + bits = 0; + valid_bits = 0; + for (init_code = 0; init_code < clear; init_code++) { + g->codes[init_code].prefix = -1; + g->codes[init_code].first = (stbi_uc) init_code; + g->codes[init_code].suffix = (stbi_uc) init_code; + } + + // support no starting clear code + avail = clear+2; + oldcode = -1; + + len = 0; + for(;;) { + if (valid_bits < codesize) { + if (len == 0) { + len = stbi__get8(s); // start new block + if (len == 0) + return g->out; + } + --len; + bits |= (stbi__int32) stbi__get8(s) << valid_bits; + valid_bits += 8; + } else { + stbi__int32 code = bits & codemask; + bits >>= codesize; + valid_bits -= codesize; + // @OPTIMIZE: is there some way we can accelerate the non-clear path? + if (code == clear) { // clear code + codesize = lzw_cs + 1; + codemask = (1 << codesize) - 1; + avail = clear + 2; + oldcode = -1; + first = 0; + } else if (code == clear + 1) { // end of stream code + stbi__skip(s, len); + while ((len = stbi__get8(s)) > 0) + stbi__skip(s,len); + return g->out; + } else if (code <= avail) { + if (first) { + return stbi__errpuc("no clear code", "Corrupt GIF"); + } + + if (oldcode >= 0) { + p = &g->codes[avail++]; + if (avail > 8192) { + return stbi__errpuc("too many codes", "Corrupt GIF"); + } + + p->prefix = (stbi__int16) oldcode; + p->first = g->codes[oldcode].first; + p->suffix = (code == avail) ? p->first : g->codes[code].first; + } else if (code == avail) + return stbi__errpuc("illegal code in raster", "Corrupt GIF"); + + stbi__out_gif_code(g, (stbi__uint16) code); + + if ((avail & codemask) == 0 && avail <= 0x0FFF) { + codesize++; + codemask = (1 << codesize) - 1; + } + + oldcode = code; + } else { + return stbi__errpuc("illegal code in raster", "Corrupt GIF"); + } + } + } +} + +// this function is designed to support animated gifs, although stb_image doesn't support it +// two back is the image from two frames ago, used for a very specific disposal format +static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back) +{ + int dispose; + int first_frame; + int pi; + int pcount; + STBI_NOTUSED(req_comp); + + // on first frame, any non-written pixels get the background colour (non-transparent) + first_frame = 0; + if (g->out == 0) { + if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header + if (!stbi__mad3sizes_valid(4, g->w, g->h, 0)) + return stbi__errpuc("too large", "GIF image is too large"); + pcount = g->w * g->h; + g->out = (stbi_uc *) stbi__malloc(4 * pcount); + g->background = (stbi_uc *) stbi__malloc(4 * pcount); + g->history = (stbi_uc *) stbi__malloc(pcount); + if (!g->out || !g->background || !g->history) + return stbi__errpuc("outofmem", "Out of memory"); + + // image is treated as "transparent" at the start - ie, nothing overwrites the current background; + // background colour is only used for pixels that are not rendered first frame, after that "background" + // color refers to the color that was there the previous frame. 
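+      // Three buffers are kept per GIF: out is the frame being composed (RGBA),
+      // background is what a "restore to background" disposal reverts pixels to,
+      // and history flags the pixels that were written during the current frame.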
+ memset(g->out, 0x00, 4 * pcount); + memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent) + memset(g->history, 0x00, pcount); // pixels that were affected previous frame + first_frame = 1; + } else { + // second frame - how do we dispose of the previous one? + dispose = (g->eflags & 0x1C) >> 2; + pcount = g->w * g->h; + + if ((dispose == 3) && (two_back == 0)) { + dispose = 2; // if I don't have an image to revert back to, default to the old background + } + + if (dispose == 3) { // use previous graphic + for (pi = 0; pi < pcount; ++pi) { + if (g->history[pi]) { + memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 ); + } + } + } else if (dispose == 2) { + // restore what was changed last frame to background before that frame; + for (pi = 0; pi < pcount; ++pi) { + if (g->history[pi]) { + memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 ); + } + } + } else { + // This is a non-disposal case eithe way, so just + // leave the pixels as is, and they will become the new background + // 1: do not dispose + // 0: not specified. + } + + // background is what out is after the undoing of the previou frame; + memcpy( g->background, g->out, 4 * g->w * g->h ); + } + + // clear my history; + memset( g->history, 0x00, g->w * g->h ); // pixels that were affected previous frame + + for (;;) { + int tag = stbi__get8(s); + switch (tag) { + case 0x2C: /* Image Descriptor */ + { + stbi__int32 x, y, w, h; + stbi_uc *o; + + x = stbi__get16le(s); + y = stbi__get16le(s); + w = stbi__get16le(s); + h = stbi__get16le(s); + if (((x + w) > (g->w)) || ((y + h) > (g->h))) + return stbi__errpuc("bad Image Descriptor", "Corrupt GIF"); + + g->line_size = g->w * 4; + g->start_x = x * 4; + g->start_y = y * g->line_size; + g->max_x = g->start_x + w * 4; + g->max_y = g->start_y + h * g->line_size; + g->cur_x = g->start_x; + g->cur_y = g->start_y; + + // if the width of the specified rectangle is 0, that means + // we may not see *any* pixels or the image is malformed; + // to make sure this is caught, move the current y down to + // max_y (which is what out_gif_code checks). + if (w == 0) + g->cur_y = g->max_y; + + g->lflags = stbi__get8(s); + + if (g->lflags & 0x40) { + g->step = 8 * g->line_size; // first interlaced spacing + g->parse = 3; + } else { + g->step = g->line_size; + g->parse = 0; + } + + if (g->lflags & 0x80) { + stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1); + g->color_table = (stbi_uc *) g->lpal; + } else if (g->flags & 0x80) { + g->color_table = (stbi_uc *) g->pal; + } else + return stbi__errpuc("missing color table", "Corrupt GIF"); + + o = stbi__process_gif_raster(s, g); + if (!o) return NULL; + + // if this was the first frame, + pcount = g->w * g->h; + if (first_frame && (g->bgindex > 0)) { + // if first frame, any pixel not drawn to gets the background color + for (pi = 0; pi < pcount; ++pi) { + if (g->history[pi] == 0) { + g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be; + memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 ); + } + } + } + + return o; + } + + case 0x21: // Comment Extension. + { + int len; + int ext = stbi__get8(s); + if (ext == 0xF9) { // Graphic Control Extension. + len = stbi__get8(s); + if (len == 4) { + g->eflags = stbi__get8(s); + g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths. 
+ + // unset old transparent + if (g->transparent >= 0) { + g->pal[g->transparent][3] = 255; + } + if (g->eflags & 0x01) { + g->transparent = stbi__get8(s); + if (g->transparent >= 0) { + g->pal[g->transparent][3] = 0; + } + } else { + // don't need transparent + stbi__skip(s, 1); + g->transparent = -1; + } + } else { + stbi__skip(s, len); + break; + } + } + while ((len = stbi__get8(s)) != 0) { + stbi__skip(s, len); + } + break; + } + + case 0x3B: // gif stream termination code + return (stbi_uc *) s; // using '1' causes warning on some compilers + + default: + return stbi__errpuc("unknown code", "Corrupt GIF"); + } + } +} + +static void *stbi__load_gif_main_outofmem(stbi__gif *g, stbi_uc *out, int **delays) +{ + STBI_FREE(g->out); + STBI_FREE(g->history); + STBI_FREE(g->background); + + if (out) STBI_FREE(out); + if (delays && *delays) STBI_FREE(*delays); + return stbi__errpuc("outofmem", "Out of memory"); +} + +static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp) +{ + if (stbi__gif_test(s)) { + int layers = 0; + stbi_uc *u = 0; + stbi_uc *out = 0; + stbi_uc *two_back = 0; + stbi__gif g; + int stride; + int out_size = 0; + int delays_size = 0; + + STBI_NOTUSED(out_size); + STBI_NOTUSED(delays_size); + + memset(&g, 0, sizeof(g)); + if (delays) { + *delays = 0; + } + + do { + u = stbi__gif_load_next(s, &g, comp, req_comp, two_back); + if (u == (stbi_uc *) s) u = 0; // end of animated gif marker + + if (u) { + *x = g.w; + *y = g.h; + ++layers; + stride = g.w * g.h * 4; + + if (out) { + void *tmp = (stbi_uc*) STBI_REALLOC_SIZED( out, out_size, layers * stride ); + if (!tmp) + return stbi__load_gif_main_outofmem(&g, out, delays); + else { + out = (stbi_uc*) tmp; + out_size = layers * stride; + } + + if (delays) { + int *new_delays = (int*) STBI_REALLOC_SIZED( *delays, delays_size, sizeof(int) * layers ); + if (!new_delays) + return stbi__load_gif_main_outofmem(&g, out, delays); + *delays = new_delays; + delays_size = layers * sizeof(int); + } + } else { + out = (stbi_uc*)stbi__malloc( layers * stride ); + if (!out) + return stbi__load_gif_main_outofmem(&g, out, delays); + out_size = layers * stride; + if (delays) { + *delays = (int*) stbi__malloc( layers * sizeof(int) ); + if (!*delays) + return stbi__load_gif_main_outofmem(&g, out, delays); + delays_size = layers * sizeof(int); + } + } + memcpy( out + ((layers - 1) * stride), u, stride ); + if (layers >= 2) { + two_back = out - 2 * stride; + } + + if (delays) { + (*delays)[layers - 1U] = g.delay; + } + } + } while (u != 0); + + // free temp buffer; + STBI_FREE(g.out); + STBI_FREE(g.history); + STBI_FREE(g.background); + + // do the final conversion after loading everything; + if (req_comp && req_comp != 4) + out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h); + + *z = layers; + return out; + } else { + return stbi__errpuc("not GIF", "Image was not as a gif type."); + } +} + +static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + stbi_uc *u = 0; + stbi__gif g; + memset(&g, 0, sizeof(g)); + STBI_NOTUSED(ri); + + u = stbi__gif_load_next(s, &g, comp, req_comp, 0); + if (u == (stbi_uc *) s) u = 0; // end of animated gif marker + if (u) { + *x = g.w; + *y = g.h; + + // moved conversion to after successful load so that the same + // can be done for multiple frames. 
+ if (req_comp && req_comp != 4) + u = stbi__convert_format(u, 4, req_comp, g.w, g.h); + } else if (g.out) { + // if there was an error and we allocated an image buffer, free it! + STBI_FREE(g.out); + } + + // free buffers needed for multiple frame loading; + STBI_FREE(g.history); + STBI_FREE(g.background); + + return u; +} + +static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp) +{ + return stbi__gif_info_raw(s,x,y,comp); +} +#endif + +// ************************************************************************************************* +// Radiance RGBE HDR loader +// originally by Nicolas Schulz +#ifndef STBI_NO_HDR +static int stbi__hdr_test_core(stbi__context *s, const char *signature) +{ + int i; + for (i=0; signature[i]; ++i) + if (stbi__get8(s) != signature[i]) + return 0; + stbi__rewind(s); + return 1; +} + +static int stbi__hdr_test(stbi__context* s) +{ + int r = stbi__hdr_test_core(s, "#?RADIANCE\n"); + stbi__rewind(s); + if(!r) { + r = stbi__hdr_test_core(s, "#?RGBE\n"); + stbi__rewind(s); + } + return r; +} + +#define STBI__HDR_BUFLEN 1024 +static char *stbi__hdr_gettoken(stbi__context *z, char *buffer) +{ + int len=0; + char c = '\0'; + + c = (char) stbi__get8(z); + + while (!stbi__at_eof(z) && c != '\n') { + buffer[len++] = c; + if (len == STBI__HDR_BUFLEN-1) { + // flush to end of line + while (!stbi__at_eof(z) && stbi__get8(z) != '\n') + ; + break; + } + c = (char) stbi__get8(z); + } + + buffer[len] = 0; + return buffer; +} + +static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp) +{ + if ( input[3] != 0 ) { + float f1; + // Exponent + f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8)); + if (req_comp <= 2) + output[0] = (input[0] + input[1] + input[2]) * f1 / 3; + else { + output[0] = input[0] * f1; + output[1] = input[1] * f1; + output[2] = input[2] * f1; + } + if (req_comp == 2) output[1] = 1; + if (req_comp == 4) output[3] = 1; + } else { + switch (req_comp) { + case 4: output[3] = 1; /* fallthrough */ + case 3: output[0] = output[1] = output[2] = 0; + break; + case 2: output[1] = 1; /* fallthrough */ + case 1: output[0] = 0; + break; + } + } +} + +static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + char buffer[STBI__HDR_BUFLEN]; + char *token; + int valid = 0; + int width, height; + stbi_uc *scanline; + float *hdr_data; + int len; + unsigned char count, value; + int i, j, k, c1,c2, z; + const char *headerToken; + STBI_NOTUSED(ri); + + // Check identifier + headerToken = stbi__hdr_gettoken(s,buffer); + if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0) + return stbi__errpf("not HDR", "Corrupt HDR image"); + + // Parse header + for(;;) { + token = stbi__hdr_gettoken(s,buffer); + if (token[0] == 0) break; + if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1; + } + + if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format"); + + // Parse width and height + // can't use sscanf() if we're not using stdio! 
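+   // The resolution line of a Radiance file reads "-Y <height> +X <width>";
+   // other orientations exist in the spec but are rejected below.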
+ token = stbi__hdr_gettoken(s,buffer); + if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format"); + token += 3; + height = (int) strtol(token, &token, 10); + while (*token == ' ') ++token; + if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format"); + token += 3; + width = (int) strtol(token, NULL, 10); + + if (height > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)"); + if (width > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)"); + + *x = width; + *y = height; + + if (comp) *comp = 3; + if (req_comp == 0) req_comp = 3; + + if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0)) + return stbi__errpf("too large", "HDR image is too large"); + + // Read data + hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0); + if (!hdr_data) + return stbi__errpf("outofmem", "Out of memory"); + + // Load image data + // image data is stored as some number of sca + if ( width < 8 || width >= 32768) { + // Read flat data + for (j=0; j < height; ++j) { + for (i=0; i < width; ++i) { + stbi_uc rgbe[4]; + main_decode_loop: + stbi__getn(s, rgbe, 4); + stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp); + } + } + } else { + // Read RLE-encoded data + scanline = NULL; + + for (j = 0; j < height; ++j) { + c1 = stbi__get8(s); + c2 = stbi__get8(s); + len = stbi__get8(s); + if (c1 != 2 || c2 != 2 || (len & 0x80)) { + // not run-length encoded, so we have to actually use THIS data as a decoded + // pixel (note this can't be a valid pixel--one of RGB must be >= 128) + stbi_uc rgbe[4]; + rgbe[0] = (stbi_uc) c1; + rgbe[1] = (stbi_uc) c2; + rgbe[2] = (stbi_uc) len; + rgbe[3] = (stbi_uc) stbi__get8(s); + stbi__hdr_convert(hdr_data, rgbe, req_comp); + i = 1; + j = 0; + STBI_FREE(scanline); + goto main_decode_loop; // yes, this makes no sense + } + len <<= 8; + len |= stbi__get8(s); + if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); } + if (scanline == NULL) { + scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0); + if (!scanline) { + STBI_FREE(hdr_data); + return stbi__errpf("outofmem", "Out of memory"); + } + } + + for (k = 0; k < 4; ++k) { + int nleft; + i = 0; + while ((nleft = width - i) > 0) { + count = stbi__get8(s); + if (count > 128) { + // Run + value = stbi__get8(s); + count -= 128; + if ((count == 0) || (count > nleft)) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); } + for (z = 0; z < count; ++z) + scanline[i++ * 4 + k] = value; + } else { + // Dump + if ((count == 0) || (count > nleft)) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); } + for (z = 0; z < count; ++z) + scanline[i++ * 4 + k] = stbi__get8(s); + } + } + } + for (i=0; i < width; ++i) + stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp); + } + if (scanline) + STBI_FREE(scanline); + } + + return hdr_data; +} + +static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp) +{ + char buffer[STBI__HDR_BUFLEN]; + char *token; + int valid = 0; + int dummy; + + if (!x) x = &dummy; + if (!y) y = &dummy; + if (!comp) comp = &dummy; + + if (stbi__hdr_test(s) == 0) { + stbi__rewind( s ); + return 0; + } + + for(;;) { + token = stbi__hdr_gettoken(s,buffer); + if (token[0] == 0) break; + if (strcmp(token, 
"FORMAT=32-bit_rle_rgbe") == 0) valid = 1; + } + + if (!valid) { + stbi__rewind( s ); + return 0; + } + token = stbi__hdr_gettoken(s,buffer); + if (strncmp(token, "-Y ", 3)) { + stbi__rewind( s ); + return 0; + } + token += 3; + *y = (int) strtol(token, &token, 10); + while (*token == ' ') ++token; + if (strncmp(token, "+X ", 3)) { + stbi__rewind( s ); + return 0; + } + token += 3; + *x = (int) strtol(token, NULL, 10); + *comp = 3; + return 1; +} +#endif // STBI_NO_HDR + +#ifndef STBI_NO_BMP +static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp) +{ + void *p; + stbi__bmp_data info; + + info.all_a = 255; + p = stbi__bmp_parse_header(s, &info); + if (p == NULL) { + stbi__rewind( s ); + return 0; + } + if (x) *x = s->img_x; + if (y) *y = s->img_y; + if (comp) { + if (info.bpp == 24 && info.ma == 0xff000000) + *comp = 3; + else + *comp = info.ma ? 4 : 3; + } + return 1; +} +#endif + +#ifndef STBI_NO_PSD +static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp) +{ + int channelCount, dummy, depth; + if (!x) x = &dummy; + if (!y) y = &dummy; + if (!comp) comp = &dummy; + if (stbi__get32be(s) != 0x38425053) { + stbi__rewind( s ); + return 0; + } + if (stbi__get16be(s) != 1) { + stbi__rewind( s ); + return 0; + } + stbi__skip(s, 6); + channelCount = stbi__get16be(s); + if (channelCount < 0 || channelCount > 16) { + stbi__rewind( s ); + return 0; + } + *y = stbi__get32be(s); + *x = stbi__get32be(s); + depth = stbi__get16be(s); + if (depth != 8 && depth != 16) { + stbi__rewind( s ); + return 0; + } + if (stbi__get16be(s) != 3) { + stbi__rewind( s ); + return 0; + } + *comp = 4; + return 1; +} + +static int stbi__psd_is16(stbi__context *s) +{ + int channelCount, depth; + if (stbi__get32be(s) != 0x38425053) { + stbi__rewind( s ); + return 0; + } + if (stbi__get16be(s) != 1) { + stbi__rewind( s ); + return 0; + } + stbi__skip(s, 6); + channelCount = stbi__get16be(s); + if (channelCount < 0 || channelCount > 16) { + stbi__rewind( s ); + return 0; + } + STBI_NOTUSED(stbi__get32be(s)); + STBI_NOTUSED(stbi__get32be(s)); + depth = stbi__get16be(s); + if (depth != 16) { + stbi__rewind( s ); + return 0; + } + return 1; +} +#endif + +#ifndef STBI_NO_PIC +static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp) +{ + int act_comp=0,num_packets=0,chained,dummy; + stbi__pic_packet packets[10]; + + if (!x) x = &dummy; + if (!y) y = &dummy; + if (!comp) comp = &dummy; + + if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) { + stbi__rewind(s); + return 0; + } + + stbi__skip(s, 88); + + *x = stbi__get16be(s); + *y = stbi__get16be(s); + if (stbi__at_eof(s)) { + stbi__rewind( s); + return 0; + } + if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) { + stbi__rewind( s ); + return 0; + } + + stbi__skip(s, 8); + + do { + stbi__pic_packet *packet; + + if (num_packets==sizeof(packets)/sizeof(packets[0])) + return 0; + + packet = &packets[num_packets++]; + chained = stbi__get8(s); + packet->size = stbi__get8(s); + packet->type = stbi__get8(s); + packet->channel = stbi__get8(s); + act_comp |= packet->channel; + + if (stbi__at_eof(s)) { + stbi__rewind( s ); + return 0; + } + if (packet->size != 8) { + stbi__rewind( s ); + return 0; + } + } while (chained); + + *comp = (act_comp & 0x10 ? 
4 : 3); + + return 1; +} +#endif + +// ************************************************************************************************* +// Portable Gray Map and Portable Pixel Map loader +// by Ken Miller +// +// PGM: http://netpbm.sourceforge.net/doc/pgm.html +// PPM: http://netpbm.sourceforge.net/doc/ppm.html +// +// Known limitations: +// Does not support comments in the header section +// Does not support ASCII image data (formats P2 and P3) + +#ifndef STBI_NO_PNM + +static int stbi__pnm_test(stbi__context *s) +{ + char p, t; + p = (char) stbi__get8(s); + t = (char) stbi__get8(s); + if (p != 'P' || (t != '5' && t != '6')) { + stbi__rewind( s ); + return 0; + } + return 1; +} + +static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri) +{ + stbi_uc *out; + STBI_NOTUSED(ri); + + ri->bits_per_channel = stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n); + if (ri->bits_per_channel == 0) + return 0; + + if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)"); + + *x = s->img_x; + *y = s->img_y; + if (comp) *comp = s->img_n; + + if (!stbi__mad4sizes_valid(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0)) + return stbi__errpuc("too large", "PNM too large"); + + out = (stbi_uc *) stbi__malloc_mad4(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0); + if (!out) return stbi__errpuc("outofmem", "Out of memory"); + if (!stbi__getn(s, out, s->img_n * s->img_x * s->img_y * (ri->bits_per_channel / 8))) { + STBI_FREE(out); + return stbi__errpuc("bad PNM", "PNM file truncated"); + } + + if (req_comp && req_comp != s->img_n) { + if (ri->bits_per_channel == 16) { + out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, s->img_n, req_comp, s->img_x, s->img_y); + } else { + out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y); + } + if (out == NULL) return out; // stbi__convert_format frees input on failure + } + return out; +} + +static int stbi__pnm_isspace(char c) +{ + return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r'; +} + +static void stbi__pnm_skip_whitespace(stbi__context *s, char *c) +{ + for (;;) { + while (!stbi__at_eof(s) && stbi__pnm_isspace(*c)) + *c = (char) stbi__get8(s); + + if (stbi__at_eof(s) || *c != '#') + break; + + while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' ) + *c = (char) stbi__get8(s); + } +} + +static int stbi__pnm_isdigit(char c) +{ + return c >= '0' && c <= '9'; +} + +static int stbi__pnm_getinteger(stbi__context *s, char *c) +{ + int value = 0; + + while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) { + value = value*10 + (*c - '0'); + *c = (char) stbi__get8(s); + if((value > 214748364) || (value == 214748364 && *c > '7')) + return stbi__err("integer parse overflow", "Parsing an integer in the PPM header overflowed a 32-bit int"); + } + + return value; +} + +static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp) +{ + int maxv, dummy; + char c, p, t; + + if (!x) x = &dummy; + if (!y) y = &dummy; + if (!comp) comp = &dummy; + + stbi__rewind(s); + + // Get identifier + p = (char) stbi__get8(s); + t = (char) stbi__get8(s); + if (p != 'P' || (t != '5' && t != '6')) { + stbi__rewind(s); + return 0; + } + + *comp = (t == '6') ? 
3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm + + c = (char) stbi__get8(s); + stbi__pnm_skip_whitespace(s, &c); + + *x = stbi__pnm_getinteger(s, &c); // read width + if(*x == 0) + return stbi__err("invalid width", "PPM image header had zero or overflowing width"); + stbi__pnm_skip_whitespace(s, &c); + + *y = stbi__pnm_getinteger(s, &c); // read height + if (*y == 0) + return stbi__err("invalid width", "PPM image header had zero or overflowing width"); + stbi__pnm_skip_whitespace(s, &c); + + maxv = stbi__pnm_getinteger(s, &c); // read max value + if (maxv > 65535) + return stbi__err("max value > 65535", "PPM image supports only 8-bit and 16-bit images"); + else if (maxv > 255) + return 16; + else + return 8; +} + +static int stbi__pnm_is16(stbi__context *s) +{ + if (stbi__pnm_info(s, NULL, NULL, NULL) == 16) + return 1; + return 0; +} +#endif + +static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp) +{ + #ifndef STBI_NO_JPEG + if (stbi__jpeg_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_PNG + if (stbi__png_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_GIF + if (stbi__gif_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_BMP + if (stbi__bmp_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_PSD + if (stbi__psd_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_PIC + if (stbi__pic_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_PNM + if (stbi__pnm_info(s, x, y, comp)) return 1; + #endif + + #ifndef STBI_NO_HDR + if (stbi__hdr_info(s, x, y, comp)) return 1; + #endif + + // test tga last because it's a crappy test! + #ifndef STBI_NO_TGA + if (stbi__tga_info(s, x, y, comp)) + return 1; + #endif + return stbi__err("unknown image type", "Image not of any known type, or corrupt"); +} + +static int stbi__is_16_main(stbi__context *s) +{ + #ifndef STBI_NO_PNG + if (stbi__png_is16(s)) return 1; + #endif + + #ifndef STBI_NO_PSD + if (stbi__psd_is16(s)) return 1; + #endif + + #ifndef STBI_NO_PNM + if (stbi__pnm_is16(s)) return 1; + #endif + return 0; +} + +#ifndef STBI_NO_STDIO +STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp) +{ + FILE *f = stbi__fopen(filename, "rb"); + int result; + if (!f) return stbi__err("can't fopen", "Unable to open file"); + result = stbi_info_from_file(f, x, y, comp); + fclose(f); + return result; +} + +STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp) +{ + int r; + stbi__context s; + long pos = ftell(f); + stbi__start_file(&s, f); + r = stbi__info_main(&s,x,y,comp); + fseek(f,pos,SEEK_SET); + return r; +} + +STBIDEF int stbi_is_16_bit(char const *filename) +{ + FILE *f = stbi__fopen(filename, "rb"); + int result; + if (!f) return stbi__err("can't fopen", "Unable to open file"); + result = stbi_is_16_bit_from_file(f); + fclose(f); + return result; +} + +STBIDEF int stbi_is_16_bit_from_file(FILE *f) +{ + int r; + stbi__context s; + long pos = ftell(f); + stbi__start_file(&s, f); + r = stbi__is_16_main(&s); + fseek(f,pos,SEEK_SET); + return r; +} +#endif // !STBI_NO_STDIO + +STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp) +{ + stbi__context s; + stbi__start_mem(&s,buffer,len); + return stbi__info_main(&s,x,y,comp); +} + +STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp) +{ + stbi__context s; + stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user); + return stbi__info_main(&s,x,y,comp); +} + +STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const 
*buffer, int len) +{ + stbi__context s; + stbi__start_mem(&s,buffer,len); + return stbi__is_16_main(&s); +} + +STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user) +{ + stbi__context s; + stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user); + return stbi__is_16_main(&s); +} + +#endif // STB_IMAGE_IMPLEMENTATION + +/* + revision history: + 2.20 (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs + 2.19 (2018-02-11) fix warning + 2.18 (2018-01-30) fix warnings + 2.17 (2018-01-29) change sbti__shiftsigned to avoid clang -O2 bug + 1-bit BMP + *_is_16_bit api + avoid warnings + 2.16 (2017-07-23) all functions have 16-bit variants; + STBI_NO_STDIO works again; + compilation fixes; + fix rounding in unpremultiply; + optimize vertical flip; + disable raw_len validation; + documentation fixes + 2.15 (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode; + warning fixes; disable run-time SSE detection on gcc; + uniform handling of optional "return" values; + thread-safe initialization of zlib tables + 2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs + 2.13 (2016-11-29) add 16-bit API, only supported for PNG right now + 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes + 2.11 (2016-04-02) allocate large structures on the stack + remove white matting for transparent PSD + fix reported channel count for PNG & BMP + re-enable SSE2 in non-gcc 64-bit + support RGB-formatted JPEG + read 16-bit PNGs (only as 8-bit) + 2.10 (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED + 2.09 (2016-01-16) allow comments in PNM files + 16-bit-per-pixel TGA (not bit-per-component) + info() for TGA could break due to .hdr handling + info() for BMP to shares code instead of sloppy parse + can use STBI_REALLOC_SIZED if allocator doesn't support realloc + code cleanup + 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA + 2.07 (2015-09-13) fix compiler warnings + partial animated GIF support + limited 16-bpc PSD support + #ifdef unused functions + bug with < 92 byte PIC,PNM,HDR,TGA + 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value + 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning + 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit + 2.03 (2015-04-12) extra corruption checking (mmozeiko) + stbi_set_flip_vertically_on_load (nguillemot) + fix NEON support; fix mingw support + 2.02 (2015-01-19) fix incorrect assert, fix warning + 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2 + 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG + 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg) + progressive JPEG (stb) + PGM/PPM support (Ken Miller) + STBI_MALLOC,STBI_REALLOC,STBI_FREE + GIF bugfix -- seemingly never worked + STBI_NO_*, STBI_ONLY_* + 1.48 (2014-12-14) fix incorrectly-named assert() + 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb) + optimize PNG (ryg) + fix bug in interlaced PNG with user-specified channel count (stb) + 1.46 (2014-08-26) + fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG + 1.45 (2014-08-16) + fix MSVC-ARM internal compiler error by wrapping malloc + 1.44 (2014-08-07) + various warning fixes from Ronny Chevalier + 1.43 (2014-07-15) + fix MSVC-only compiler problem in code changed in 1.42 + 1.42 (2014-07-09) + don't define _CRT_SECURE_NO_WARNINGS (affects user code) + fixes to stbi__cleanup_jpeg path + added STBI_ASSERT 
to avoid requiring assert.h + 1.41 (2014-06-25) + fix search&replace from 1.36 that messed up comments/error messages + 1.40 (2014-06-22) + fix gcc struct-initialization warning + 1.39 (2014-06-15) + fix to TGA optimization when req_comp != number of components in TGA; + fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite) + add support for BMP version 5 (more ignored fields) + 1.38 (2014-06-06) + suppress MSVC warnings on integer casts truncating values + fix accidental rename of 'skip' field of I/O + 1.37 (2014-06-04) + remove duplicate typedef + 1.36 (2014-06-03) + convert to header file single-file library + if de-iphone isn't set, load iphone images color-swapped instead of returning NULL + 1.35 (2014-05-27) + various warnings + fix broken STBI_SIMD path + fix bug where stbi_load_from_file no longer left file pointer in correct place + fix broken non-easy path for 32-bit BMP (possibly never used) + TGA optimization by Arseny Kapoulkine + 1.34 (unknown) + use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case + 1.33 (2011-07-14) + make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements + 1.32 (2011-07-13) + support for "info" function for all supported filetypes (SpartanJ) + 1.31 (2011-06-20) + a few more leak fixes, bug in PNG handling (SpartanJ) + 1.30 (2011-06-11) + added ability to load files via callbacks to accomidate custom input streams (Ben Wenger) + removed deprecated format-specific test/load functions + removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway + error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha) + fix inefficiency in decoding 32-bit BMP (David Woo) + 1.29 (2010-08-16) + various warning fixes from Aurelien Pocheville + 1.28 (2010-08-01) + fix bug in GIF palette transparency (SpartanJ) + 1.27 (2010-08-01) + cast-to-stbi_uc to fix warnings + 1.26 (2010-07-24) + fix bug in file buffering for PNG reported by SpartanJ + 1.25 (2010-07-17) + refix trans_data warning (Won Chun) + 1.24 (2010-07-12) + perf improvements reading from files on platforms with lock-heavy fgetc() + minor perf improvements for jpeg + deprecated type-specific functions so we'll get feedback if they're needed + attempt to fix trans_data warning (Won Chun) + 1.23 fixed bug in iPhone support + 1.22 (2010-07-10) + removed image *writing* support + stbi_info support from Jetro Lauha + GIF support from Jean-Marc Lienher + iPhone PNG-extensions from James Brown + warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva) + 1.21 fix use of 'stbi_uc' in header (reported by jon blow) + 1.20 added support for Softimage PIC, by Tom Seddon + 1.19 bug in interlaced PNG corruption check (found by ryg) + 1.18 (2008-08-02) + fix a threading bug (local mutable static) + 1.17 support interlaced PNG + 1.16 major bugfix - stbi__convert_format converted one too many pixels + 1.15 initialize some fields for thread safety + 1.14 fix threadsafe conversion bug + header-file-only version (#define STBI_HEADER_FILE_ONLY before including) + 1.13 threadsafe + 1.12 const qualifiers in the API + 1.11 Support installable IDCT, colorspace conversion routines + 1.10 Fixes for 64-bit (don't use "unsigned long") + optimized upsampling by Fabian "ryg" Giesen + 1.09 Fix format-conversion for PSD code (bad global variables!) 
+ 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz + 1.07 attempt to fix C++ warning/errors again + 1.06 attempt to fix C++ warning/errors again + 1.05 fix TGA loading to return correct *comp and use good luminance calc + 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free + 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR + 1.02 support for (subset of) HDR files, float interface for preferred access to them + 1.01 fix bug: possible bug in handling right-side up bmps... not sure + fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all + 1.00 interface to zlib that skips zlib header + 0.99 correct handling of alpha in palette + 0.98 TGA loader by lonesock; dynamically add loaders (untested) + 0.97 jpeg errors on too large a file; also catch another malloc failure + 0.96 fix detection of invalid v value - particleman@mollyrocket forum + 0.95 during header scan, seek to markers in case of padding + 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same + 0.93 handle jpegtran output; verbose errors + 0.92 read 4,8,16,24,32-bit BMP files of several formats + 0.91 output 24-bit Windows 3.0 BMP files + 0.90 fix a few more warnings; bump version number to approach 1.0 + 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd + 0.60 fix compiling as c++ + 0.59 fix warnings: merge Dave Moore's -Wall fixes + 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian + 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available + 0.56 fix bug: zlib uncompressed mode len vs. nlen + 0.55 fix bug: restart_interval not initialized to 0 + 0.54 allow NULL for 'int *comp' + 0.53 fix bug in png 3->4; speedup png decoding + 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments + 0.51 obey req_comp requests, 1-component jpegs return as 1-component, + on 'test' only check type, not whether we support this variant + 0.50 (2006-11-19) + first released version +*/ + + +/* +------------------------------------------------------------------------------ +This software is available under 2 licenses -- choose whichever you prefer. +------------------------------------------------------------------------------ +ALTERNATIVE A - MIT License +Copyright (c) 2017 Sean Barrett +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +------------------------------------------------------------------------------ +ALTERNATIVE B - Public Domain (www.unlicense.org) +This is free and unencumbered software released into the public domain. 
+Anyone is free to copy, modify, publish, use, compile, sell, or distribute this +software, either in source code form or as a compiled binary, for any purpose, +commercial or non-commercial, and by any means. +In jurisdictions that recognize copyright laws, the author or authors of this +software dedicate any and all copyright interest in the software to the public +domain. We make this dedication for the benefit of the public at large and to +the detriment of our heirs and successors. We intend this dedication to be an +overt act of relinquishment in perpetuity of all present and future rights to +this software under copyright law. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +------------------------------------------------------------------------------ +*/ diff --git a/modules/v4d/third/stb/stb_truetype.h b/modules/v4d/third/stb/stb_truetype.h new file mode 100644 index 000000000..bbf2284b1 --- /dev/null +++ b/modules/v4d/third/stb/stb_truetype.h @@ -0,0 +1,5077 @@ +// stb_truetype.h - v1.26 - public domain +// authored from 2009-2021 by Sean Barrett / RAD Game Tools +// +// ======================================================================= +// +// NO SECURITY GUARANTEE -- DO NOT USE THIS ON UNTRUSTED FONT FILES +// +// This library does no range checking of the offsets found in the file, +// meaning an attacker can use it to read arbitrary memory. +// +// ======================================================================= +// +// This library processes TrueType files: +// parse files +// extract glyph metrics +// extract glyph shapes +// render glyphs to one-channel bitmaps with antialiasing (box filter) +// render glyphs to one-channel SDF bitmaps (signed-distance field/function) +// +// Todo: +// non-MS cmaps +// crashproof on bad data +// hinting? (no longer patented) +// cleartype-style AA? +// optimize: use simple memory allocator for intermediates +// optimize: build edge-list directly from curves +// optimize: rasterize directly from curves? +// +// ADDITIONAL CONTRIBUTORS +// +// Mikko Mononen: compound shape support, more cmap formats +// Tor Andersson: kerning, subpixel rendering +// Dougall Johnson: OpenType / Type 2 font handling +// Daniel Ribeiro Maciel: basic GPOS-based kerning +// +// Misc other: +// Ryan Gordon +// Simon Glass +// github:IntellectualKitty +// Imanol Celaya +// Daniel Ribeiro Maciel +// +// Bug/warning reports/fixes: +// "Zer" on mollyrocket Fabian "ryg" Giesen github:NiLuJe +// Cass Everitt Martins Mozeiko github:aloucks +// stoiko (Haemimont Games) Cap Petschulat github:oyvindjam +// Brian Hook Omar Cornut github:vassvik +// Walter van Niftrik Ryan Griege +// David Gow Peter LaValle +// David Given Sergey Popov +// Ivan-Assen Ivanov Giumo X. Clanjor +// Anthony Pesch Higor Euripedes +// Johan Duparc Thomas Fields +// Hou Qiming Derek Vinyard +// Rob Loach Cort Stratton +// Kenney Phillis Jr. 
Brian Costabile +// Ken Voskuil (kaesve) +// +// VERSION HISTORY +// +// 1.26 (2021-08-28) fix broken rasterizer +// 1.25 (2021-07-11) many fixes +// 1.24 (2020-02-05) fix warning +// 1.23 (2020-02-02) query SVG data for glyphs; query whole kerning table (but only kern not GPOS) +// 1.22 (2019-08-11) minimize missing-glyph duplication; fix kerning if both 'GPOS' and 'kern' are defined +// 1.21 (2019-02-25) fix warning +// 1.20 (2019-02-07) PackFontRange skips missing codepoints; GetScaleFontVMetrics() +// 1.19 (2018-02-11) GPOS kerning, STBTT_fmod +// 1.18 (2018-01-29) add missing function +// 1.17 (2017-07-23) make more arguments const; doc fix +// 1.16 (2017-07-12) SDF support +// 1.15 (2017-03-03) make more arguments const +// 1.14 (2017-01-16) num-fonts-in-TTC function +// 1.13 (2017-01-02) support OpenType fonts, certain Apple fonts +// 1.12 (2016-10-25) suppress warnings about casting away const with -Wcast-qual +// 1.11 (2016-04-02) fix unused-variable warning +// 1.10 (2016-04-02) user-defined fabs(); rare memory leak; remove duplicate typedef +// 1.09 (2016-01-16) warning fix; avoid crash on outofmem; use allocation userdata properly +// 1.08 (2015-09-13) document stbtt_Rasterize(); fixes for vertical & horizontal edges +// 1.07 (2015-08-01) allow PackFontRanges to accept arrays of sparse codepoints; +// variant PackFontRanges to pack and render in separate phases; +// fix stbtt_GetFontOFfsetForIndex (never worked for non-0 input?); +// fixed an assert() bug in the new rasterizer +// replace assert() with STBTT_assert() in new rasterizer +// +// Full history can be found at the end of this file. +// +// LICENSE +// +// See end of file for license information. +// +// USAGE +// +// Include this file in whatever places need to refer to it. In ONE C/C++ +// file, write: +// #define STB_TRUETYPE_IMPLEMENTATION +// before the #include of this file. This expands out the actual +// implementation into that C/C++ file. +// +// To make the implementation private to the file that generates the implementation, +// #define STBTT_STATIC +// +// Simple 3D API (don't ship this, but it's fine for tools and quick start) +// stbtt_BakeFontBitmap() -- bake a font to a bitmap for use as texture +// stbtt_GetBakedQuad() -- compute quad to draw for a given char +// +// Improved 3D API (more shippable): +// #include "stb_rect_pack.h" -- optional, but you really want it +// stbtt_PackBegin() +// stbtt_PackSetOversampling() -- for improved quality on small fonts +// stbtt_PackFontRanges() -- pack and renders +// stbtt_PackEnd() +// stbtt_GetPackedQuad() +// +// "Load" a font file from a memory buffer (you have to keep the buffer loaded) +// stbtt_InitFont() +// stbtt_GetFontOffsetForIndex() -- indexing for TTC font collections +// stbtt_GetNumberOfFonts() -- number of fonts for TTC font collections +// +// Render a unicode codepoint to a bitmap +// stbtt_GetCodepointBitmap() -- allocates and returns a bitmap +// stbtt_MakeCodepointBitmap() -- renders into bitmap you provide +// stbtt_GetCodepointBitmapBox() -- how big the bitmap must be +// +// Character advance/positioning +// stbtt_GetCodepointHMetrics() +// stbtt_GetFontVMetrics() +// stbtt_GetFontVMetricsOS2() +// stbtt_GetCodepointKernAdvance() +// +// Starting with version 1.06, the rasterizer was replaced with a new, +// faster and generally-more-precise rasterizer. 
The new rasterizer more +// accurately measures pixel coverage for anti-aliasing, except in the case +// where multiple shapes overlap, in which case it overestimates the AA pixel +// coverage. Thus, anti-aliasing of intersecting shapes may look wrong. If +// this turns out to be a problem, you can re-enable the old rasterizer with +// #define STBTT_RASTERIZER_VERSION 1 +// which will incur about a 15% speed hit. +// +// ADDITIONAL DOCUMENTATION +// +// Immediately after this block comment are a series of sample programs. +// +// After the sample programs is the "header file" section. This section +// includes documentation for each API function. +// +// Some important concepts to understand to use this library: +// +// Codepoint +// Characters are defined by unicode codepoints, e.g. 65 is +// uppercase A, 231 is lowercase c with a cedilla, 0x7e30 is +// the hiragana for "ma". +// +// Glyph +// A visual character shape (every codepoint is rendered as +// some glyph) +// +// Glyph index +// A font-specific integer ID representing a glyph +// +// Baseline +// Glyph shapes are defined relative to a baseline, which is the +// bottom of uppercase characters. Characters extend both above +// and below the baseline. +// +// Current Point +// As you draw text to the screen, you keep track of a "current point" +// which is the origin of each character. The current point's vertical +// position is the baseline. Even "baked fonts" use this model. +// +// Vertical Font Metrics +// The vertical qualities of the font, used to vertically position +// and space the characters. See docs for stbtt_GetFontVMetrics. +// +// Font Size in Pixels or Points +// The preferred interface for specifying font sizes in stb_truetype +// is to specify how tall the font's vertical extent should be in pixels. +// If that sounds good enough, skip the next paragraph. +// +// Most font APIs instead use "points", which are a common typographic +// measurement for describing font size, defined as 72 points per inch. +// stb_truetype provides a point API for compatibility. However, true +// "per inch" conventions don't make much sense on computer displays +// since different monitors have different number of pixels per +// inch. For example, Windows traditionally uses a convention that +// there are 96 pixels per inch, thus making 'inch' measurements have +// nothing to do with inches, and thus effectively defining a point to +// be 1.333 pixels. Additionally, the TrueType font data provides +// an explicit scale factor to scale a given font's glyphs to points, +// but the author has observed that this scale factor is often wrong +// for non-commercial fonts, thus making fonts scaled in points +// according to the TrueType spec incoherently sized in practice. +// +// DETAILED USAGE: +// +// Scale: +// Select how high you want the font to be, in points or pixels. +// Call ScaleForPixelHeight or ScaleForMappingEmToPixels to compute +// a scale factor SF that will be used by all other functions. +// +// Baseline: +// You need to select a y-coordinate that is the baseline of where +// your text will appear. Call GetFontBoundingBox to get the baseline-relative +// bounding box for all characters. SF*-y0 will be the distance in pixels +// that the worst-case character could extend above the baseline, so if +// you want the top edge of characters to appear at the top of the +// screen where y=0, then you would set the baseline to SF*-y0. +// +// Current point: +// Set the current point where the first character will appear. 
The
+//    first character could extend left of the current point; this is font
+//    dependent. You can either choose a current point that is the leftmost
+//    point and hope, or add some padding, or check the bounding box or
+//    left-side-bearing of the first character to be displayed and set
+//    the current point based on that.
+//
+// Displaying a character:
+//    Compute the bounding box of the character. It will contain signed values
+//    relative to <current_point, baseline>. I.e. if it returns x0,y0,x1,y1,
+//    then the character should be displayed in the rectangle from
+//    <current_point+SF*x0, baseline+SF*y0> to <current_point+SF*x1, baseline+SF*y1>.
+//
+// Advancing for the next character:
+//    Call GlyphHMetrics, and compute 'current_point += SF * advance'.
+//
+//
+//////////////////////////////////////////////////////////////////////////////
+//
+// SAMPLE PROGRAMS
+//
+//   Incomplete text-in-3d-api example, which draws quads properly aligned to be lossless.
+//
+#if 0
+#define STB_TRUETYPE_IMPLEMENTATION  // force following include to generate implementation
+#include "stb_truetype.h"
+
+unsigned char ttf_buffer[1<<20];
+unsigned char temp_bitmap[512*512];
+
+stbtt_bakedchar cdata[96]; // ASCII 32..126 is 95 glyphs
+GLuint ftex;
+
+void my_stbtt_initfont(void)
+{
+   fread(ttf_buffer, 1, 1<<20, fopen("c:/windows/fonts/times.ttf", "rb"));
+   stbtt_BakeFontBitmap(ttf_buffer,0, 32.0, temp_bitmap,512,512, 32,96, cdata); // no guarantee this fits!
+   // can free ttf_buffer at this point
+   glGenTextures(1, &ftex);
+   glBindTexture(GL_TEXTURE_2D, ftex);
+   glTexImage2D(GL_TEXTURE_2D, 0, GL_ALPHA, 512,512, 0, GL_ALPHA, GL_UNSIGNED_BYTE, temp_bitmap);
+   // can free temp_bitmap at this point
+   glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+}
+
+void my_stbtt_print(float x, float y, char *text)
+{
+   // assume orthographic projection with units = screen pixels, origin at top left
+   glEnable(GL_BLEND);
+   glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
+   glEnable(GL_TEXTURE_2D);
+   glBindTexture(GL_TEXTURE_2D, ftex);
+   glBegin(GL_QUADS);
+   while (*text) {
+      if (*text >= 32 && *text < 128) {
+         stbtt_aligned_quad q;
+         stbtt_GetBakedQuad(cdata, 512,512, *text-32, &x,&y,&q,1);//1=opengl & d3d10+,0=d3d9
+         glTexCoord2f(q.s0,q.t0); glVertex2f(q.x0,q.y0);
+         glTexCoord2f(q.s1,q.t0); glVertex2f(q.x1,q.y0);
+         glTexCoord2f(q.s1,q.t1); glVertex2f(q.x1,q.y1);
+         glTexCoord2f(q.s0,q.t1); glVertex2f(q.x0,q.y1);
+      }
+      ++text;
+   }
+   glEnd();
+}
+#endif
+//
+//
+//////////////////////////////////////////////////////////////////////////////
+//
+// Complete program (this compiles): get a single bitmap, print as ASCII art
+//
+#if 0
+#include <stdio.h>
+#define STB_TRUETYPE_IMPLEMENTATION  // force following include to generate implementation
+#include "stb_truetype.h"
+
+char ttf_buffer[1<<25];
+
+int main(int argc, char **argv)
+{
+   stbtt_fontinfo font;
+   unsigned char *bitmap;
+   int w,h,i,j,c = (argc > 1 ? atoi(argv[1]) : 'a'), s = (argc > 2 ? atoi(argv[2]) : 20);
+
+   fread(ttf_buffer, 1, 1<<25, fopen(argc > 3 ? argv[3] : "c:/windows/fonts/arialbd.ttf", "rb"));
+
+   stbtt_InitFont(&font, ttf_buffer, stbtt_GetFontOffsetForIndex(ttf_buffer,0));
+   bitmap = stbtt_GetCodepointBitmap(&font, 0,stbtt_ScaleForPixelHeight(&font, s), c, &w, &h, 0,0);
+
+   for (j=0; j < h; ++j) {
+      for (i=0; i < w; ++i)
+         putchar(" .:ioVM@"[bitmap[j*w+i]>>5]);
+      putchar('\n');
+   }
+   return 0;
+}
+#endif
+//
+// Output:
+//
+//     .ii.
+//    @@@@@@.
+//   V@Mio@@o
+//   :i.  V@V
+//     :oM@@M
+//   :@@@MM@M
+//   @@o  o@M
+//  :@@.  M@M
+//   @@@o@@@@
+//   :M@@V:@@.
+//
+//////////////////////////////////////////////////////////////////////////////
+//
+// Complete program: print "Hello World!" banner, with bugs
+//
+#if 0
+char buffer[24<<20];
+unsigned char screen[20][79];
+
+int main(int arg, char **argv)
+{
+   stbtt_fontinfo font;
+   int i,j,ascent,baseline,ch=0;
+   float scale, xpos=2; // leave a little padding in case the character extends left
+   char *text = "Heljo World!"; // intentionally misspelled to show 'lj' brokenness
+
+   fread(buffer, 1, 1000000, fopen("c:/windows/fonts/arialbd.ttf", "rb"));
+   stbtt_InitFont(&font, buffer, 0);
+
+   scale = stbtt_ScaleForPixelHeight(&font, 15);
+   stbtt_GetFontVMetrics(&font, &ascent,0,0);
+   baseline = (int) (ascent*scale);
+
+   while (text[ch]) {
+      int advance,lsb,x0,y0,x1,y1;
+      float x_shift = xpos - (float) floor(xpos);
+      stbtt_GetCodepointHMetrics(&font, text[ch], &advance, &lsb);
+      stbtt_GetCodepointBitmapBoxSubpixel(&font, text[ch], scale,scale,x_shift,0, &x0,&y0,&x1,&y1);
+      stbtt_MakeCodepointBitmapSubpixel(&font, &screen[baseline + y0][(int) xpos + x0], x1-x0,y1-y0, 79, scale,scale,x_shift,0, text[ch]);
+      // note that this stomps the old data, so where character boxes overlap (e.g. 'lj') it's wrong
+      // because this API is really for baking character bitmaps into textures.
If you want to render
+ // a sequence of characters, you really need to render each bitmap to a temp buffer, then
+ // "alpha blend" that into the working buffer
+ xpos += (advance * scale);
+ if (text[ch+1])
+ xpos += scale*stbtt_GetCodepointKernAdvance(&font, text[ch],text[ch+1]);
+ ++ch;
+ }
+
+ for (j=0; j < 20; ++j) {
+ for (i=0; i < 78; ++i)
+ putchar(" .:ioVM@"[screen[j][i]>>5]);
+ putchar('\n');
+ }
+
+ return 0;
+}
+#endif
+
+
+//////////////////////////////////////////////////////////////////////////////
+//////////////////////////////////////////////////////////////////////////////
+////
+//// INTEGRATION WITH YOUR CODEBASE
+////
+//// The following sections allow you to supply alternate definitions
+//// of C library functions used by stb_truetype, e.g. if you don't
+//// link with the C runtime library.
+
+#ifdef STB_TRUETYPE_IMPLEMENTATION
+ // #define your own (u)stbtt_int8/16/32 before including to override this
+ #ifndef stbtt_uint8
+ typedef unsigned char stbtt_uint8;
+ typedef signed char stbtt_int8;
+ typedef unsigned short stbtt_uint16;
+ typedef signed short stbtt_int16;
+ typedef unsigned int stbtt_uint32;
+ typedef signed int stbtt_int32;
+ #endif
+
+ typedef char stbtt__check_size32[sizeof(stbtt_int32)==4 ? 1 : -1];
+ typedef char stbtt__check_size16[sizeof(stbtt_int16)==2 ? 1 : -1];
+
+ // e.g. #define your own STBTT_ifloor/STBTT_iceil() to avoid math.h
+ #ifndef STBTT_ifloor
+ #include <math.h>
+ #define STBTT_ifloor(x) ((int) floor(x))
+ #define STBTT_iceil(x) ((int) ceil(x))
+ #endif
+
+ #ifndef STBTT_sqrt
+ #include <math.h>
+ #define STBTT_sqrt(x) sqrt(x)
+ #define STBTT_pow(x,y) pow(x,y)
+ #endif
+
+ #ifndef STBTT_fmod
+ #include <math.h>
+ #define STBTT_fmod(x,y) fmod(x,y)
+ #endif
+
+ #ifndef STBTT_cos
+ #include <math.h>
+ #define STBTT_cos(x) cos(x)
+ #define STBTT_acos(x) acos(x)
+ #endif
+
+ #ifndef STBTT_fabs
+ #include <math.h>
+ #define STBTT_fabs(x) fabs(x)
+ #endif
+
+ // #define your own functions "STBTT_malloc" / "STBTT_free" to avoid malloc.h
+ #ifndef STBTT_malloc
+ #include <stdlib.h>
+ #define STBTT_malloc(x,u) ((void)(u),malloc(x))
+ #define STBTT_free(x,u) ((void)(u),free(x))
+ #endif
+
+ #ifndef STBTT_assert
+ #include <assert.h>
+ #define STBTT_assert(x) assert(x)
+ #endif
+
+ #ifndef STBTT_strlen
+ #include <string.h>
+ #define STBTT_strlen(x) strlen(x)
+ #endif
+
+ #ifndef STBTT_memcpy
+ #include <string.h>
+ #define STBTT_memcpy memcpy
+ #define STBTT_memset memset
+ #endif
+#endif
+
+///////////////////////////////////////////////////////////////////////////////
+///////////////////////////////////////////////////////////////////////////////
+////
+//// INTERFACE
+////
+////
+
+#ifndef __STB_INCLUDE_STB_TRUETYPE_H__
+#define __STB_INCLUDE_STB_TRUETYPE_H__
+
+#ifdef STBTT_STATIC
+#define STBTT_DEF static
+#else
+#define STBTT_DEF extern
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+// private structure
+typedef struct
+{
+ unsigned char *data;
+ int cursor;
+ int size;
+} stbtt__buf;
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// TEXTURE BAKING API
+//
+// If you use this API, you only have to call two functions ever.
+// + +typedef struct +{ + unsigned short x0,y0,x1,y1; // coordinates of bbox in bitmap + float xoff,yoff,xadvance; +} stbtt_bakedchar; + +STBTT_DEF int stbtt_BakeFontBitmap(const unsigned char *data, int offset, // font location (use offset=0 for plain .ttf) + float pixel_height, // height of font in pixels + unsigned char *pixels, int pw, int ph, // bitmap to be filled in + int first_char, int num_chars, // characters to bake + stbtt_bakedchar *chardata); // you allocate this, it's num_chars long +// if return is positive, the first unused row of the bitmap +// if return is negative, returns the negative of the number of characters that fit +// if return is 0, no characters fit and no rows were used +// This uses a very crappy packing. + +typedef struct +{ + float x0,y0,s0,t0; // top-left + float x1,y1,s1,t1; // bottom-right +} stbtt_aligned_quad; + +STBTT_DEF void stbtt_GetBakedQuad(const stbtt_bakedchar *chardata, int pw, int ph, // same data as above + int char_index, // character to display + float *xpos, float *ypos, // pointers to current position in screen pixel space + stbtt_aligned_quad *q, // output: quad to draw + int opengl_fillrule); // true if opengl fill rule; false if DX9 or earlier +// Call GetBakedQuad with char_index = 'character - first_char', and it +// creates the quad you need to draw and advances the current position. +// +// The coordinate system used assumes y increases downwards. +// +// Characters will extend both above and below the current position; +// see discussion of "BASELINE" above. +// +// It's inefficient; you might want to c&p it and optimize it. + +STBTT_DEF void stbtt_GetScaledFontVMetrics(const unsigned char *fontdata, int index, float size, float *ascent, float *descent, float *lineGap); +// Query the font vertical metrics without having to create a font first. + + +////////////////////////////////////////////////////////////////////////////// +// +// NEW TEXTURE BAKING API +// +// This provides options for packing multiple fonts into one atlas, not +// perfectly but better than nothing. + +typedef struct +{ + unsigned short x0,y0,x1,y1; // coordinates of bbox in bitmap + float xoff,yoff,xadvance; + float xoff2,yoff2; +} stbtt_packedchar; + +typedef struct stbtt_pack_context stbtt_pack_context; +typedef struct stbtt_fontinfo stbtt_fontinfo; +#ifndef STB_RECT_PACK_VERSION +typedef struct stbrp_rect stbrp_rect; +#endif + +STBTT_DEF int stbtt_PackBegin(stbtt_pack_context *spc, unsigned char *pixels, int width, int height, int stride_in_bytes, int padding, void *alloc_context); +// Initializes a packing context stored in the passed-in stbtt_pack_context. +// Future calls using this context will pack characters into the bitmap passed +// in here: a 1-channel bitmap that is width * height. stride_in_bytes is +// the distance from one row to the next (or 0 to mean they are packed tightly +// together). "padding" is the amount of padding to leave between each +// character (normally you want '1' for bitmaps you'll use as textures with +// bilinear filtering). +// +// Returns 0 on failure, 1 on success. + +STBTT_DEF void stbtt_PackEnd (stbtt_pack_context *spc); +// Cleans up the packing context and frees all memory. 
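+
+// A minimal usage sketch for the tri-state return value of stbtt_BakeFontBitmap()
+// described above, assuming an arbitrary font path, a 512x512 atlas and the
+// ASCII range 32..126; only the return-value handling follows the documented rules.
+#if 0
+#include <stdio.h>
+unsigned char ttf[1<<20];
+unsigned char atlas[512*512];
+stbtt_bakedchar baked[96];
+
+int bake_ascii(void)
+{
+   int rc;
+   fread(ttf, 1, 1<<20, fopen("c:/windows/fonts/times.ttf", "rb"));
+   rc = stbtt_BakeFontBitmap(ttf, 0, 32.0f, atlas, 512,512, 32,96, baked);
+   if (rc > 0)       printf("all characters baked; first unused row = %d\n", rc);
+   else if (rc < 0)  printf("only %d characters fit in the 512x512 atlas\n", -rc);
+   else              printf("no characters fit and no rows were used\n");
+   return rc;
+}
+#endif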
+
+#define STBTT_POINT_SIZE(x) (-(x))
+
+STBTT_DEF int stbtt_PackFontRange(stbtt_pack_context *spc, const unsigned char *fontdata, int font_index, float font_size,
+ int first_unicode_char_in_range, int num_chars_in_range, stbtt_packedchar *chardata_for_range);
+// Creates character bitmaps from the font_index'th font found in fontdata (use
+// font_index=0 if you don't know what that is). It creates num_chars_in_range
+// bitmaps for characters with unicode values starting at first_unicode_char_in_range
+// and increasing. Data for how to render them is stored in chardata_for_range;
+// pass these to stbtt_GetPackedQuad to get back renderable quads.
+//
+// font_size is the full height of the character from ascender to descender,
+// as computed by stbtt_ScaleForPixelHeight. To use a point size as computed
+// by stbtt_ScaleForMappingEmToPixels, wrap the point size in STBTT_POINT_SIZE()
+// and pass that result as 'font_size':
+// ..., 20 , ... // font max minus min y is 20 pixels tall
+// ..., STBTT_POINT_SIZE(20), ... // 'M' is 20 pixels tall
+
+typedef struct
+{
+ float font_size;
+ int first_unicode_codepoint_in_range; // if non-zero, then the chars are continuous, and this is the first codepoint
+ int *array_of_unicode_codepoints; // if non-zero, then this is an array of unicode codepoints
+ int num_chars;
+ stbtt_packedchar *chardata_for_range; // output
+ unsigned char h_oversample, v_oversample; // don't set these, they're used internally
+} stbtt_pack_range;
+
+STBTT_DEF int stbtt_PackFontRanges(stbtt_pack_context *spc, const unsigned char *fontdata, int font_index, stbtt_pack_range *ranges, int num_ranges);
+// Creates character bitmaps from multiple ranges of characters stored in
+// ranges. This will usually create a better-packed bitmap than multiple
+// calls to stbtt_PackFontRange. Note that you can call this multiple
+// times within a single PackBegin/PackEnd.
+
+STBTT_DEF void stbtt_PackSetOversampling(stbtt_pack_context *spc, unsigned int h_oversample, unsigned int v_oversample);
+// Oversampling a font increases the quality by allowing higher-quality subpixel
+// positioning, and is especially valuable at smaller text sizes.
+//
+// This function sets the amount of oversampling for all following calls to
+// stbtt_PackFontRange(s) or stbtt_PackFontRangesGatherRects for a given
+// pack context. The default (no oversampling) is achieved by h_oversample=1
+// and v_oversample=1. The total number of pixels required is
+// h_oversample*v_oversample larger than the default; for example, 2x2
+// oversampling requires 4x the storage of 1x1. For best results, render
+// oversampled textures with bilinear filtering. Look at the readme in
+// stb/tests/oversample for information about oversampled fonts.
+//
+// To use with PackFontRangesGather etc., you must set it before calls
+// to PackFontRangesGatherRects.
+
+STBTT_DEF void stbtt_PackSetSkipMissingCodepoints(stbtt_pack_context *spc, int skip);
+// If skip != 0, this tells stb_truetype to skip any codepoints for which
+// there is no corresponding glyph. If skip=0, which is the default, then
+// codepoints without a glyph receive the font's "missing character" glyph,
+// typically an empty box by convention.
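+
+// A minimal sketch of the packing lifecycle documented above, assuming an
+// already-loaded TTF in 'ttf' and a 1024x1024 single-channel atlas:
+// PackBegin, optional PackSetOversampling, one PackFontRange call, then PackEnd.
+#if 0
+unsigned char atlas[1024*1024];
+stbtt_packedchar packed[95];
+
+int pack_ascii(const unsigned char *ttf)
+{
+   stbtt_pack_context spc;
+   if (!stbtt_PackBegin(&spc, atlas, 1024,1024, 0, 1, NULL))
+      return 0;                              // allocation failed
+   stbtt_PackSetOversampling(&spc, 2, 2);    // 2x2 oversampling -> 4x the pixels
+   stbtt_PackFontRange(&spc, ttf, 0, 32.0f, 32, 95, packed);
+   stbtt_PackEnd(&spc);                      // frees packing state; the atlas stays valid
+   return 1;
+}
+#endif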
+
+STBTT_DEF void stbtt_GetPackedQuad(const stbtt_packedchar *chardata, int pw, int ph, // same data as above
+ int char_index, // character to display
+ float *xpos, float *ypos, // pointers to current position in screen pixel space
+ stbtt_aligned_quad *q, // output: quad to draw
+ int align_to_integer);
+
+STBTT_DEF int stbtt_PackFontRangesGatherRects(stbtt_pack_context *spc, const stbtt_fontinfo *info, stbtt_pack_range *ranges, int num_ranges, stbrp_rect *rects);
+STBTT_DEF void stbtt_PackFontRangesPackRects(stbtt_pack_context *spc, stbrp_rect *rects, int num_rects);
+STBTT_DEF int stbtt_PackFontRangesRenderIntoRects(stbtt_pack_context *spc, const stbtt_fontinfo *info, stbtt_pack_range *ranges, int num_ranges, stbrp_rect *rects);
+// Calling these functions in sequence is roughly equivalent to calling
+// stbtt_PackFontRanges(). If you want more control over the packing of multiple
+// fonts, or if you want to pack custom data into a font texture, take a look
+// at the source of stbtt_PackFontRanges() and create a custom version
+// using these functions, e.g. call GatherRects multiple times,
+// building up a single array of rects, then call PackRects once,
+// then call RenderIntoRects repeatedly. This may result in a
+// better packing than calling PackFontRanges multiple times
+// (or it may not).
+
+// this is an opaque structure that you shouldn't mess with which holds
+// all the context needed from PackBegin to PackEnd.
+struct stbtt_pack_context {
+ void *user_allocator_context;
+ void *pack_info;
+ int width;
+ int height;
+ int stride_in_bytes;
+ int padding;
+ int skip_missing;
+ unsigned int h_oversample, v_oversample;
+ unsigned char *pixels;
+ void *nodes;
+};
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// FONT LOADING
+//
+//
+
+STBTT_DEF int stbtt_GetNumberOfFonts(const unsigned char *data);
+// This function will determine the number of fonts in a font file. TrueType
+// collection (.ttc) files may contain multiple fonts, while TrueType font
+// (.ttf) files only contain one font. The number of fonts can be used for
+// indexing with the previous function where the index is between zero and one
+// less than the total fonts. If an error occurs, -1 is returned.
+
+STBTT_DEF int stbtt_GetFontOffsetForIndex(const unsigned char *data, int index);
+// Each .ttf/.ttc file may have more than one font. Each font has a sequential
+// index number starting from 0. Call this function to get the font offset for
+// a given index; it returns -1 if the index is out of range. A regular .ttf
+// file will only define one font and it will always be at offset 0, so it will
+// return '0' for index 0, and -1 for all other indices.
+
+// The following structure is defined publicly so you can declare one on
+// the stack or as a global or etc, but you should treat it as opaque.
+struct stbtt_fontinfo
+{
+ void * userdata;
+ unsigned char * data; // pointer to .ttf file
+ int fontstart; // offset of start of font
+
+ int numGlyphs; // number of glyphs, needed for range checking
+
+ int loca,head,glyf,hhea,hmtx,kern,gpos,svg; // table locations as offset from start of .ttf
+ int index_map; // a cmap mapping for our chosen character encoding
+ int indexToLocFormat; // format needed to map from glyph index to glyph
+
+ stbtt__buf cff; // cff font data
+ stbtt__buf charstrings; // the charstring index
+ stbtt__buf gsubrs; // global charstring subroutines index
+ stbtt__buf subrs; // private charstring subroutines index
+ stbtt__buf fontdicts; // array of font dicts
+ stbtt__buf fdselect; // map from glyph to fontdict
+};
+
+STBTT_DEF int stbtt_InitFont(stbtt_fontinfo *info, const unsigned char *data, int offset);
+// Given an offset into the file that defines a font, this function builds
+// the necessary cached info for the rest of the system. You must allocate
+// the stbtt_fontinfo yourself, and stbtt_InitFont will fill it out. You don't
+// need to do anything special to free it, because the contents are pure
+// value data with no additional data structures. Returns 0 on failure.
+
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// CHARACTER TO GLYPH-INDEX CONVERSION
+
+STBTT_DEF int stbtt_FindGlyphIndex(const stbtt_fontinfo *info, int unicode_codepoint);
+// If you're going to perform multiple operations on the same character
+// and you want a speed-up, call this function with the character you're
+// going to process, then use glyph-based functions instead of the
+// codepoint-based functions.
+// Returns 0 if the character codepoint is not defined in the font.
+
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// CHARACTER PROPERTIES
+//
+
+STBTT_DEF float stbtt_ScaleForPixelHeight(const stbtt_fontinfo *info, float pixels);
+// computes a scale factor to produce a font whose "height" is 'pixels' tall.
+// Height is measured as the distance from the highest ascender to the lowest
+// descender; in other words, it's equivalent to calling stbtt_GetFontVMetrics
+// and computing:
+// scale = pixels / (ascent - descent)
+// so if you prefer to measure height by the ascent only, use a similar calculation.
+
+STBTT_DEF float stbtt_ScaleForMappingEmToPixels(const stbtt_fontinfo *info, float pixels);
+// computes a scale factor to produce a font whose EM size is mapped to
+// 'pixels' tall. This is probably what traditional APIs compute, but
+// I'm not positive.
+
+STBTT_DEF void stbtt_GetFontVMetrics(const stbtt_fontinfo *info, int *ascent, int *descent, int *lineGap);
+// ascent is the coordinate above the baseline the font extends; descent
+// is the coordinate below the baseline the font extends (i.e. it is typically negative)
+// lineGap is the spacing between one row's descent and the next row's ascent...
+// so you should advance the vertical position by "*ascent - *descent + *lineGap"
+// these are expressed in unscaled coordinates, so you must multiply by
+// the scale factor for a given size
+
+STBTT_DEF int stbtt_GetFontVMetricsOS2(const stbtt_fontinfo *info, int *typoAscent, int *typoDescent, int *typoLineGap);
+// analogous to GetFontVMetrics, but returns the "typographic" values from the OS/2
+// table (specific to MS/Windows TTF files).
+//
+// Returns 1 on success (table present), 0 on failure.
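+
+// A minimal sketch tying the functions above together, assuming 'ttf' already
+// holds the font file in memory: initialize the font, derive a scale for a
+// 20-pixel-tall line, and compute a pixel baseline from the unscaled metrics.
+#if 0
+void metrics_example(const unsigned char *ttf)
+{
+   stbtt_fontinfo font;
+   int ascent, descent, lineGap, glyph;
+   float scale, baseline;
+   if (!stbtt_InitFont(&font, ttf, stbtt_GetFontOffsetForIndex(ttf, 0)))
+      return;
+   scale = stbtt_ScaleForPixelHeight(&font, 20.0f);
+   stbtt_GetFontVMetrics(&font, &ascent, &descent, &lineGap);
+   baseline = ascent * scale;                 // distance from the top of the line down to the baseline
+   // advance to the next line by scale * (ascent - descent + lineGap)
+   glyph = stbtt_FindGlyphIndex(&font, 'A');  // cache glyph indices to skip repeated cmap lookups
+   (void) baseline; (void) glyph;
+}
+#endif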
+ +STBTT_DEF void stbtt_GetFontBoundingBox(const stbtt_fontinfo *info, int *x0, int *y0, int *x1, int *y1); +// the bounding box around all possible characters + +STBTT_DEF void stbtt_GetCodepointHMetrics(const stbtt_fontinfo *info, int codepoint, int *advanceWidth, int *leftSideBearing); +// leftSideBearing is the offset from the current horizontal position to the left edge of the character +// advanceWidth is the offset from the current horizontal position to the next horizontal position +// these are expressed in unscaled coordinates + +STBTT_DEF int stbtt_GetCodepointKernAdvance(const stbtt_fontinfo *info, int ch1, int ch2); +// an additional amount to add to the 'advance' value between ch1 and ch2 + +STBTT_DEF int stbtt_GetCodepointBox(const stbtt_fontinfo *info, int codepoint, int *x0, int *y0, int *x1, int *y1); +// Gets the bounding box of the visible part of the glyph, in unscaled coordinates + +STBTT_DEF void stbtt_GetGlyphHMetrics(const stbtt_fontinfo *info, int glyph_index, int *advanceWidth, int *leftSideBearing); +STBTT_DEF int stbtt_GetGlyphKernAdvance(const stbtt_fontinfo *info, int glyph1, int glyph2); +STBTT_DEF int stbtt_GetGlyphBox(const stbtt_fontinfo *info, int glyph_index, int *x0, int *y0, int *x1, int *y1); +// as above, but takes one or more glyph indices for greater efficiency + +typedef struct stbtt_kerningentry +{ + int glyph1; // use stbtt_FindGlyphIndex + int glyph2; + int advance; +} stbtt_kerningentry; + +STBTT_DEF int stbtt_GetKerningTableLength(const stbtt_fontinfo *info); +STBTT_DEF int stbtt_GetKerningTable(const stbtt_fontinfo *info, stbtt_kerningentry* table, int table_length); +// Retrieves a complete list of all of the kerning pairs provided by the font +// stbtt_GetKerningTable never writes more than table_length entries and returns how many entries it did write. +// The table will be sorted by (a.glyph1 == b.glyph1)?(a.glyph2 < b.glyph2):(a.glyph1 < b.glyph1) + +////////////////////////////////////////////////////////////////////////////// +// +// GLYPH SHAPES (you probably don't need these, but they have to go before +// the bitmaps for C declaration-order reasons) +// + +#ifndef STBTT_vmove // you can predefine these to use different values (but why?) + enum { + STBTT_vmove=1, + STBTT_vline, + STBTT_vcurve, + STBTT_vcubic + }; +#endif + +#ifndef stbtt_vertex // you can predefine this to use different values + // (we share this with other code at RAD) + #define stbtt_vertex_type short // can't use stbtt_int16 because that's not visible in the header file + typedef struct + { + stbtt_vertex_type x,y,cx,cy,cx1,cy1; + unsigned char type,padding; + } stbtt_vertex; +#endif + +STBTT_DEF int stbtt_IsGlyphEmpty(const stbtt_fontinfo *info, int glyph_index); +// returns non-zero if nothing is drawn for this glyph + +STBTT_DEF int stbtt_GetCodepointShape(const stbtt_fontinfo *info, int unicode_codepoint, stbtt_vertex **vertices); +STBTT_DEF int stbtt_GetGlyphShape(const stbtt_fontinfo *info, int glyph_index, stbtt_vertex **vertices); +// returns # of vertices and fills *vertices with the pointer to them +// these are expressed in "unscaled" coordinates +// +// The shape is a series of contours. Each one starts with +// a STBTT_moveto, then consists of a series of mixed +// STBTT_lineto and STBTT_curveto segments. A lineto +// draws a line from previous endpoint to its x,y; a curveto +// draws a quadratic bezier from previous endpoint to +// its x,y, using cx,cy as the bezier control point. 
+
+STBTT_DEF void stbtt_FreeShape(const stbtt_fontinfo *info, stbtt_vertex *vertices);
+// frees the data allocated above
+
+STBTT_DEF unsigned char *stbtt_FindSVGDoc(const stbtt_fontinfo *info, int gl);
+STBTT_DEF int stbtt_GetCodepointSVG(const stbtt_fontinfo *info, int unicode_codepoint, const char **svg);
+STBTT_DEF int stbtt_GetGlyphSVG(const stbtt_fontinfo *info, int gl, const char **svg);
+// fills svg with the character's SVG data.
+// returns data size or 0 if SVG not found.
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// BITMAP RENDERING
+//
+
+STBTT_DEF void stbtt_FreeBitmap(unsigned char *bitmap, void *userdata);
+// frees the bitmap allocated below
+
+STBTT_DEF unsigned char *stbtt_GetCodepointBitmap(const stbtt_fontinfo *info, float scale_x, float scale_y, int codepoint, int *width, int *height, int *xoff, int *yoff);
+// allocates a large-enough single-channel 8bpp bitmap and renders the
+// specified character/glyph at the specified scale into it, with
+// antialiasing. 0 is no coverage (transparent), 255 is fully covered (opaque).
+// *width & *height are filled out with the width & height of the bitmap,
+// which is stored left-to-right, top-to-bottom.
+//
+// xoff/yoff are the offset in pixel space from the glyph origin to the top-left of the bitmap
+
+STBTT_DEF unsigned char *stbtt_GetCodepointBitmapSubpixel(const stbtt_fontinfo *info, float scale_x, float scale_y, float shift_x, float shift_y, int codepoint, int *width, int *height, int *xoff, int *yoff);
+// the same as stbtt_GetCodepointBitmap, but you can specify a subpixel
+// shift for the character
+
+STBTT_DEF void stbtt_MakeCodepointBitmap(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, int codepoint);
+// the same as stbtt_GetCodepointBitmap, but you pass in storage for the bitmap
+// in the form of 'output', with row spacing of 'out_stride' bytes. the bitmap
+// is clipped to out_w/out_h bytes. Call stbtt_GetCodepointBitmapBox to get the
+// width and height and positioning info for it first.
+
+STBTT_DEF void stbtt_MakeCodepointBitmapSubpixel(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int codepoint);
+// same as stbtt_MakeCodepointBitmap, but you can specify a subpixel
+// shift for the character
+
+STBTT_DEF void stbtt_MakeCodepointBitmapSubpixelPrefilter(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int oversample_x, int oversample_y, float *sub_x, float *sub_y, int codepoint);
+// same as stbtt_MakeCodepointBitmapSubpixel, but prefiltering
+// is performed (see stbtt_PackSetOversampling)
+
+STBTT_DEF void stbtt_GetCodepointBitmapBox(const stbtt_fontinfo *font, int codepoint, float scale_x, float scale_y, int *ix0, int *iy0, int *ix1, int *iy1);
+// get the bbox of the bitmap centered around the glyph origin; so the
+// bitmap width is ix1-ix0, height is iy1-iy0, and location to place
+// the bitmap top left is (leftSideBearing*scale,iy0).
+// (Note that the bitmap uses y-increases-down, but the shape uses
+// y-increases-up, so CodepointBitmapBox and CodepointBox are inverted.)
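+
+// A minimal sketch of rendering into caller-provided storage as documented
+// above: query the bitmap box first, then rasterize into a tightly packed
+// buffer (stride == width). 'font' and 'scale' are assumed to be set up as in
+// the samples at the top of this file; the caller frees the returned buffer.
+#if 0
+#include <stdlib.h>
+unsigned char *render_codepoint(const stbtt_fontinfo *font, float scale, int codepoint, int *w, int *h)
+{
+   int x0,y0,x1,y1;
+   unsigned char *pixels;
+   stbtt_GetCodepointBitmapBox(font, codepoint, scale, scale, &x0,&y0,&x1,&y1);
+   *w = x1 - x0;
+   *h = y1 - y0;
+   pixels = (unsigned char *) malloc((size_t)(*w) * (size_t)(*h));
+   if (pixels)
+      stbtt_MakeCodepointBitmap(font, pixels, *w, *h, *w, scale, scale, codepoint);
+   return pixels;
+}
+#endif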
+
+STBTT_DEF void stbtt_GetCodepointBitmapBoxSubpixel(const stbtt_fontinfo *font, int codepoint, float scale_x, float scale_y, float shift_x, float shift_y, int *ix0, int *iy0, int *ix1, int *iy1);
+// same as stbtt_GetCodepointBitmapBox, but you can specify a subpixel
+// shift for the character
+
+// the following functions are equivalent to the above functions, but operate
+// on glyph indices instead of Unicode codepoints (for efficiency)
+STBTT_DEF unsigned char *stbtt_GetGlyphBitmap(const stbtt_fontinfo *info, float scale_x, float scale_y, int glyph, int *width, int *height, int *xoff, int *yoff);
+STBTT_DEF unsigned char *stbtt_GetGlyphBitmapSubpixel(const stbtt_fontinfo *info, float scale_x, float scale_y, float shift_x, float shift_y, int glyph, int *width, int *height, int *xoff, int *yoff);
+STBTT_DEF void stbtt_MakeGlyphBitmap(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, int glyph);
+STBTT_DEF void stbtt_MakeGlyphBitmapSubpixel(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int glyph);
+STBTT_DEF void stbtt_MakeGlyphBitmapSubpixelPrefilter(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int oversample_x, int oversample_y, float *sub_x, float *sub_y, int glyph);
+STBTT_DEF void stbtt_GetGlyphBitmapBox(const stbtt_fontinfo *font, int glyph, float scale_x, float scale_y, int *ix0, int *iy0, int *ix1, int *iy1);
+STBTT_DEF void stbtt_GetGlyphBitmapBoxSubpixel(const stbtt_fontinfo *font, int glyph, float scale_x, float scale_y,float shift_x, float shift_y, int *ix0, int *iy0, int *ix1, int *iy1);
+
+
+// @TODO: don't expose this structure
+typedef struct
+{
+ int w,h,stride;
+ unsigned char *pixels;
+} stbtt__bitmap;
+
+// rasterize a shape with quadratic beziers into a bitmap
+STBTT_DEF void stbtt_Rasterize(stbtt__bitmap *result, // 1-channel bitmap to draw into
+ float flatness_in_pixels, // allowable error of curve in pixels
+ stbtt_vertex *vertices, // array of vertices defining shape
+ int num_verts, // number of vertices in above array
+ float scale_x, float scale_y, // scale applied to input vertices
+ float shift_x, float shift_y, // translation applied to input vertices
+ int x_off, int y_off, // another translation applied to input
+ int invert, // if non-zero, vertically flip shape
+ void *userdata); // context for STBTT_MALLOC
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// Signed Distance Function (or Field) rendering
+
+STBTT_DEF void stbtt_FreeSDF(unsigned char *bitmap, void *userdata);
+// frees the SDF bitmap allocated below
+
+STBTT_DEF unsigned char * stbtt_GetGlyphSDF(const stbtt_fontinfo *info, float scale, int glyph, int padding, unsigned char onedge_value, float pixel_dist_scale, int *width, int *height, int *xoff, int *yoff);
+STBTT_DEF unsigned char * stbtt_GetCodepointSDF(const stbtt_fontinfo *info, float scale, int codepoint, int padding, unsigned char onedge_value, float pixel_dist_scale, int *width, int *height, int *xoff, int *yoff);
+// These functions compute a discretized SDF field for a single character, suitable for storing
+// in a single-channel texture, sampling with bilinear filtering, and testing whether the sampled
+// value is larger than some threshold to produce scalable fonts.
+// info -- the font
+// scale -- controls the size of the resulting SDF bitmap, same as it would be creating a regular bitmap
+// glyph/codepoint -- the character to generate the SDF for
+// padding -- extra "pixels" around the character which are filled with the distance to the character (not 0),
+// which allows effects like bit outlines
+// onedge_value -- value 0-255 to test the SDF against to reconstruct the character (i.e. the isocontour of the character)
+// pixel_dist_scale -- what value the SDF should increase by when moving one SDF "pixel" away from the edge (on the 0..255 scale)
+// if positive, > onedge_value is inside; if negative, < onedge_value is inside
+// width,height -- output height & width of the SDF bitmap (including padding)
+// xoff,yoff -- output origin of the character
+// return value -- a 2D array of bytes 0..255, width*height in size
+//
+// pixel_dist_scale & onedge_value are a scale & bias that allows you to make
+// optimal use of the limited 0..255 for your application, trading off precision
+// and special effects. SDF values outside the range 0..255 are clamped to 0..255.
+//
+// Example:
+// scale = stbtt_ScaleForPixelHeight(22)
+// padding = 5
+// onedge_value = 180
+// pixel_dist_scale = 180/5.0 = 36.0
+//
+// This will create an SDF bitmap in which the character is about 22 pixels
+// high but the whole bitmap is about 22+5+5=32 pixels high. To produce a filled
+// shape, sample the SDF at each pixel and fill the pixel if the SDF value
+// is greater than or equal to 180/255. (You'll actually want to antialias,
+// which is beyond the scope of this example.) Additionally, you can compute
+// offset outlines (e.g. to stroke the character border inside & outside,
+// or only outside). For example, to fill outside the character up to 3 SDF
+// pixels, you would compare against (180-36.0*3)/255 = 72/255. The above
+// choice of variables maps a range from 5 pixels outside the shape to
+// 2 pixels inside the shape to 0..255; this is intended primarily for applying
+// outside effects only (the interior range is needed to allow proper
+// antialiasing of the font at *smaller* sizes)
+//
+// The function computes the SDF analytically at each SDF pixel, not by e.g.
+// building a higher-res bitmap and approximating it. In theory the quality
+// should be as high as possible for an SDF of this size & representation, but
+// unclear if this is true in practice (perhaps building a higher-res bitmap
+// and computing from that can allow drop-out prevention).
+//
+// The algorithm has not been optimized at all, so expect it to be slow
+// if computing lots of characters or very large sizes.
+
+
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// Finding the right font...
+//
+// You should really just solve this offline, keep your own tables
+// of what font is what, and don't try to get it out of the .ttf file.
+// That's because getting it out of the .ttf file is really hard, because
+// the names in the file can appear in many possible encodings, in many
+// possible languages, and e.g. if you need a case-insensitive comparison,
+// the details of that depend on the encoding & language in a complex way
+// (actually underspecified in truetype, but also gigantic).
+// +// But you can use the provided functions in two possible ways: +// stbtt_FindMatchingFont() will use *case-sensitive* comparisons on +// unicode-encoded names to try to find the font you want; +// you can run this before calling stbtt_InitFont() +// +// stbtt_GetFontNameString() lets you get any of the various strings +// from the file yourself and do your own comparisons on them. +// You have to have called stbtt_InitFont() first. + + +STBTT_DEF int stbtt_FindMatchingFont(const unsigned char *fontdata, const char *name, int flags); +// returns the offset (not index) of the font that matches, or -1 if none +// if you use STBTT_MACSTYLE_DONTCARE, use a font name like "Arial Bold". +// if you use any other flag, use a font name like "Arial"; this checks +// the 'macStyle' header field; i don't know if fonts set this consistently +#define STBTT_MACSTYLE_DONTCARE 0 +#define STBTT_MACSTYLE_BOLD 1 +#define STBTT_MACSTYLE_ITALIC 2 +#define STBTT_MACSTYLE_UNDERSCORE 4 +#define STBTT_MACSTYLE_NONE 8 // <= not same as 0, this makes us check the bitfield is 0 + +STBTT_DEF int stbtt_CompareUTF8toUTF16_bigendian(const char *s1, int len1, const char *s2, int len2); +// returns 1/0 whether the first string interpreted as utf8 is identical to +// the second string interpreted as big-endian utf16... useful for strings from next func + +STBTT_DEF const char *stbtt_GetFontNameString(const stbtt_fontinfo *font, int *length, int platformID, int encodingID, int languageID, int nameID); +// returns the string (which may be big-endian double byte, e.g. for unicode) +// and puts the length in bytes in *length. +// +// some of the values for the IDs are below; for more see the truetype spec: +// http://developer.apple.com/textfonts/TTRefMan/RM06/Chap6name.html +// http://www.microsoft.com/typography/otspec/name.htm + +enum { // platformID + STBTT_PLATFORM_ID_UNICODE =0, + STBTT_PLATFORM_ID_MAC =1, + STBTT_PLATFORM_ID_ISO =2, + STBTT_PLATFORM_ID_MICROSOFT =3 +}; + +enum { // encodingID for STBTT_PLATFORM_ID_UNICODE + STBTT_UNICODE_EID_UNICODE_1_0 =0, + STBTT_UNICODE_EID_UNICODE_1_1 =1, + STBTT_UNICODE_EID_ISO_10646 =2, + STBTT_UNICODE_EID_UNICODE_2_0_BMP=3, + STBTT_UNICODE_EID_UNICODE_2_0_FULL=4 +}; + +enum { // encodingID for STBTT_PLATFORM_ID_MICROSOFT + STBTT_MS_EID_SYMBOL =0, + STBTT_MS_EID_UNICODE_BMP =1, + STBTT_MS_EID_SHIFTJIS =2, + STBTT_MS_EID_UNICODE_FULL =10 +}; + +enum { // encodingID for STBTT_PLATFORM_ID_MAC; same as Script Manager codes + STBTT_MAC_EID_ROMAN =0, STBTT_MAC_EID_ARABIC =4, + STBTT_MAC_EID_JAPANESE =1, STBTT_MAC_EID_HEBREW =5, + STBTT_MAC_EID_CHINESE_TRAD =2, STBTT_MAC_EID_GREEK =6, + STBTT_MAC_EID_KOREAN =3, STBTT_MAC_EID_RUSSIAN =7 +}; + +enum { // languageID for STBTT_PLATFORM_ID_MICROSOFT; same as LCID... + // problematic because there are e.g. 
16 english LCIDs and 16 arabic LCIDs + STBTT_MS_LANG_ENGLISH =0x0409, STBTT_MS_LANG_ITALIAN =0x0410, + STBTT_MS_LANG_CHINESE =0x0804, STBTT_MS_LANG_JAPANESE =0x0411, + STBTT_MS_LANG_DUTCH =0x0413, STBTT_MS_LANG_KOREAN =0x0412, + STBTT_MS_LANG_FRENCH =0x040c, STBTT_MS_LANG_RUSSIAN =0x0419, + STBTT_MS_LANG_GERMAN =0x0407, STBTT_MS_LANG_SPANISH =0x0409, + STBTT_MS_LANG_HEBREW =0x040d, STBTT_MS_LANG_SWEDISH =0x041D +}; + +enum { // languageID for STBTT_PLATFORM_ID_MAC + STBTT_MAC_LANG_ENGLISH =0 , STBTT_MAC_LANG_JAPANESE =11, + STBTT_MAC_LANG_ARABIC =12, STBTT_MAC_LANG_KOREAN =23, + STBTT_MAC_LANG_DUTCH =4 , STBTT_MAC_LANG_RUSSIAN =32, + STBTT_MAC_LANG_FRENCH =1 , STBTT_MAC_LANG_SPANISH =6 , + STBTT_MAC_LANG_GERMAN =2 , STBTT_MAC_LANG_SWEDISH =5 , + STBTT_MAC_LANG_HEBREW =10, STBTT_MAC_LANG_CHINESE_SIMPLIFIED =33, + STBTT_MAC_LANG_ITALIAN =3 , STBTT_MAC_LANG_CHINESE_TRAD =19 +}; + +#ifdef __cplusplus +} +#endif + +#endif // __STB_INCLUDE_STB_TRUETYPE_H__ + +/////////////////////////////////////////////////////////////////////////////// +/////////////////////////////////////////////////////////////////////////////// +//// +//// IMPLEMENTATION +//// +//// + +#ifdef STB_TRUETYPE_IMPLEMENTATION + +#ifndef STBTT_MAX_OVERSAMPLE +#define STBTT_MAX_OVERSAMPLE 8 +#endif + +#if STBTT_MAX_OVERSAMPLE > 255 +#error "STBTT_MAX_OVERSAMPLE cannot be > 255" +#endif + +typedef int stbtt__test_oversample_pow2[(STBTT_MAX_OVERSAMPLE & (STBTT_MAX_OVERSAMPLE-1)) == 0 ? 1 : -1]; + +#ifndef STBTT_RASTERIZER_VERSION +#define STBTT_RASTERIZER_VERSION 2 +#endif + +#ifdef _MSC_VER +#define STBTT__NOTUSED(v) (void)(v) +#else +#define STBTT__NOTUSED(v) (void)sizeof(v) +#endif + +////////////////////////////////////////////////////////////////////////// +// +// stbtt__buf helpers to parse data from file +// + +static stbtt_uint8 stbtt__buf_get8(stbtt__buf *b) +{ + if (b->cursor >= b->size) + return 0; + return b->data[b->cursor++]; +} + +static stbtt_uint8 stbtt__buf_peek8(stbtt__buf *b) +{ + if (b->cursor >= b->size) + return 0; + return b->data[b->cursor]; +} + +static void stbtt__buf_seek(stbtt__buf *b, int o) +{ + STBTT_assert(!(o > b->size || o < 0)); + b->cursor = (o > b->size || o < 0) ? 
b->size : o; +} + +static void stbtt__buf_skip(stbtt__buf *b, int o) +{ + stbtt__buf_seek(b, b->cursor + o); +} + +static stbtt_uint32 stbtt__buf_get(stbtt__buf *b, int n) +{ + stbtt_uint32 v = 0; + int i; + STBTT_assert(n >= 1 && n <= 4); + for (i = 0; i < n; i++) + v = (v << 8) | stbtt__buf_get8(b); + return v; +} + +static stbtt__buf stbtt__new_buf(const void *p, size_t size) +{ + stbtt__buf r; + STBTT_assert(size < 0x40000000); + r.data = (stbtt_uint8*) p; + r.size = (int) size; + r.cursor = 0; + return r; +} + +#define stbtt__buf_get16(b) stbtt__buf_get((b), 2) +#define stbtt__buf_get32(b) stbtt__buf_get((b), 4) + +static stbtt__buf stbtt__buf_range(const stbtt__buf *b, int o, int s) +{ + stbtt__buf r = stbtt__new_buf(NULL, 0); + if (o < 0 || s < 0 || o > b->size || s > b->size - o) return r; + r.data = b->data + o; + r.size = s; + return r; +} + +static stbtt__buf stbtt__cff_get_index(stbtt__buf *b) +{ + int count, start, offsize; + start = b->cursor; + count = stbtt__buf_get16(b); + if (count) { + offsize = stbtt__buf_get8(b); + STBTT_assert(offsize >= 1 && offsize <= 4); + stbtt__buf_skip(b, offsize * count); + stbtt__buf_skip(b, stbtt__buf_get(b, offsize) - 1); + } + return stbtt__buf_range(b, start, b->cursor - start); +} + +static stbtt_uint32 stbtt__cff_int(stbtt__buf *b) +{ + int b0 = stbtt__buf_get8(b); + if (b0 >= 32 && b0 <= 246) return b0 - 139; + else if (b0 >= 247 && b0 <= 250) return (b0 - 247)*256 + stbtt__buf_get8(b) + 108; + else if (b0 >= 251 && b0 <= 254) return -(b0 - 251)*256 - stbtt__buf_get8(b) - 108; + else if (b0 == 28) return stbtt__buf_get16(b); + else if (b0 == 29) return stbtt__buf_get32(b); + STBTT_assert(0); + return 0; +} + +static void stbtt__cff_skip_operand(stbtt__buf *b) { + int v, b0 = stbtt__buf_peek8(b); + STBTT_assert(b0 >= 28); + if (b0 == 30) { + stbtt__buf_skip(b, 1); + while (b->cursor < b->size) { + v = stbtt__buf_get8(b); + if ((v & 0xF) == 0xF || (v >> 4) == 0xF) + break; + } + } else { + stbtt__cff_int(b); + } +} + +static stbtt__buf stbtt__dict_get(stbtt__buf *b, int key) +{ + stbtt__buf_seek(b, 0); + while (b->cursor < b->size) { + int start = b->cursor, end, op; + while (stbtt__buf_peek8(b) >= 28) + stbtt__cff_skip_operand(b); + end = b->cursor; + op = stbtt__buf_get8(b); + if (op == 12) op = stbtt__buf_get8(b) | 0x100; + if (op == key) return stbtt__buf_range(b, start, end-start); + } + return stbtt__buf_range(b, 0, 0); +} + +static void stbtt__dict_get_ints(stbtt__buf *b, int key, int outcount, stbtt_uint32 *out) +{ + int i; + stbtt__buf operands = stbtt__dict_get(b, key); + for (i = 0; i < outcount && operands.cursor < operands.size; i++) + out[i] = stbtt__cff_int(&operands); +} + +static int stbtt__cff_index_count(stbtt__buf *b) +{ + stbtt__buf_seek(b, 0); + return stbtt__buf_get16(b); +} + +static stbtt__buf stbtt__cff_index_get(stbtt__buf b, int i) +{ + int count, offsize, start, end; + stbtt__buf_seek(&b, 0); + count = stbtt__buf_get16(&b); + offsize = stbtt__buf_get8(&b); + STBTT_assert(i >= 0 && i < count); + STBTT_assert(offsize >= 1 && offsize <= 4); + stbtt__buf_skip(&b, i*offsize); + start = stbtt__buf_get(&b, offsize); + end = stbtt__buf_get(&b, offsize); + return stbtt__buf_range(&b, 2+(count+1)*offsize+start, end - start); +} + +////////////////////////////////////////////////////////////////////////// +// +// accessors to parse data from file +// + +// on platforms that don't allow misaligned reads, if we want to allow +// truetype fonts that aren't padded to alignment, define ALLOW_UNALIGNED_TRUETYPE + +#define 
ttBYTE(p) (* (stbtt_uint8 *) (p)) +#define ttCHAR(p) (* (stbtt_int8 *) (p)) +#define ttFixed(p) ttLONG(p) + +static stbtt_uint16 ttUSHORT(stbtt_uint8 *p) { return p[0]*256 + p[1]; } +static stbtt_int16 ttSHORT(stbtt_uint8 *p) { return p[0]*256 + p[1]; } +static stbtt_uint32 ttULONG(stbtt_uint8 *p) { return (p[0]<<24) + (p[1]<<16) + (p[2]<<8) + p[3]; } +static stbtt_int32 ttLONG(stbtt_uint8 *p) { return (p[0]<<24) + (p[1]<<16) + (p[2]<<8) + p[3]; } + +#define stbtt_tag4(p,c0,c1,c2,c3) ((p)[0] == (c0) && (p)[1] == (c1) && (p)[2] == (c2) && (p)[3] == (c3)) +#define stbtt_tag(p,str) stbtt_tag4(p,str[0],str[1],str[2],str[3]) + +static int stbtt__isfont(stbtt_uint8 *font) +{ + // check the version number + if (stbtt_tag4(font, '1',0,0,0)) return 1; // TrueType 1 + if (stbtt_tag(font, "typ1")) return 1; // TrueType with type 1 font -- we don't support this! + if (stbtt_tag(font, "OTTO")) return 1; // OpenType with CFF + if (stbtt_tag4(font, 0,1,0,0)) return 1; // OpenType 1.0 + if (stbtt_tag(font, "true")) return 1; // Apple specification for TrueType fonts + return 0; +} + +// @OPTIMIZE: binary search +static stbtt_uint32 stbtt__find_table(stbtt_uint8 *data, stbtt_uint32 fontstart, const char *tag) +{ + stbtt_int32 num_tables = ttUSHORT(data+fontstart+4); + stbtt_uint32 tabledir = fontstart + 12; + stbtt_int32 i; + for (i=0; i < num_tables; ++i) { + stbtt_uint32 loc = tabledir + 16*i; + if (stbtt_tag(data+loc+0, tag)) + return ttULONG(data+loc+8); + } + return 0; +} + +static int stbtt_GetFontOffsetForIndex_internal(unsigned char *font_collection, int index) +{ + // if it's just a font, there's only one valid index + if (stbtt__isfont(font_collection)) + return index == 0 ? 0 : -1; + + // check if it's a TTC + if (stbtt_tag(font_collection, "ttcf")) { + // version 1? + if (ttULONG(font_collection+4) == 0x00010000 || ttULONG(font_collection+4) == 0x00020000) { + stbtt_int32 n = ttLONG(font_collection+8); + if (index >= n) + return -1; + return ttULONG(font_collection+12+index*4); + } + } + return -1; +} + +static int stbtt_GetNumberOfFonts_internal(unsigned char *font_collection) +{ + // if it's just a font, there's only one valid font + if (stbtt__isfont(font_collection)) + return 1; + + // check if it's a TTC + if (stbtt_tag(font_collection, "ttcf")) { + // version 1? 
+ if (ttULONG(font_collection+4) == 0x00010000 || ttULONG(font_collection+4) == 0x00020000) { + return ttLONG(font_collection+8); + } + } + return 0; +} + +static stbtt__buf stbtt__get_subrs(stbtt__buf cff, stbtt__buf fontdict) +{ + stbtt_uint32 subrsoff = 0, private_loc[2] = { 0, 0 }; + stbtt__buf pdict; + stbtt__dict_get_ints(&fontdict, 18, 2, private_loc); + if (!private_loc[1] || !private_loc[0]) return stbtt__new_buf(NULL, 0); + pdict = stbtt__buf_range(&cff, private_loc[1], private_loc[0]); + stbtt__dict_get_ints(&pdict, 19, 1, &subrsoff); + if (!subrsoff) return stbtt__new_buf(NULL, 0); + stbtt__buf_seek(&cff, private_loc[1]+subrsoff); + return stbtt__cff_get_index(&cff); +} + +// since most people won't use this, find this table the first time it's needed +static int stbtt__get_svg(stbtt_fontinfo *info) +{ + stbtt_uint32 t; + if (info->svg < 0) { + t = stbtt__find_table(info->data, info->fontstart, "SVG "); + if (t) { + stbtt_uint32 offset = ttULONG(info->data + t + 2); + info->svg = t + offset; + } else { + info->svg = 0; + } + } + return info->svg; +} + +static int stbtt_InitFont_internal(stbtt_fontinfo *info, unsigned char *data, int fontstart) +{ + stbtt_uint32 cmap, t; + stbtt_int32 i,numTables; + + info->data = data; + info->fontstart = fontstart; + info->cff = stbtt__new_buf(NULL, 0); + + cmap = stbtt__find_table(data, fontstart, "cmap"); // required + info->loca = stbtt__find_table(data, fontstart, "loca"); // required + info->head = stbtt__find_table(data, fontstart, "head"); // required + info->glyf = stbtt__find_table(data, fontstart, "glyf"); // required + info->hhea = stbtt__find_table(data, fontstart, "hhea"); // required + info->hmtx = stbtt__find_table(data, fontstart, "hmtx"); // required + info->kern = stbtt__find_table(data, fontstart, "kern"); // not required + info->gpos = stbtt__find_table(data, fontstart, "GPOS"); // not required + + if (!cmap || !info->head || !info->hhea || !info->hmtx) + return 0; + if (info->glyf) { + // required for truetype + if (!info->loca) return 0; + } else { + // initialization for CFF / Type2 fonts (OTF) + stbtt__buf b, topdict, topdictidx; + stbtt_uint32 cstype = 2, charstrings = 0, fdarrayoff = 0, fdselectoff = 0; + stbtt_uint32 cff; + + cff = stbtt__find_table(data, fontstart, "CFF "); + if (!cff) return 0; + + info->fontdicts = stbtt__new_buf(NULL, 0); + info->fdselect = stbtt__new_buf(NULL, 0); + + // @TODO this should use size from table (not 512MB) + info->cff = stbtt__new_buf(data+cff, 512*1024*1024); + b = info->cff; + + // read the header + stbtt__buf_skip(&b, 2); + stbtt__buf_seek(&b, stbtt__buf_get8(&b)); // hdrsize + + // @TODO the name INDEX could list multiple fonts, + // but we just use the first one. 
+ stbtt__cff_get_index(&b); // name INDEX + topdictidx = stbtt__cff_get_index(&b); + topdict = stbtt__cff_index_get(topdictidx, 0); + stbtt__cff_get_index(&b); // string INDEX + info->gsubrs = stbtt__cff_get_index(&b); + + stbtt__dict_get_ints(&topdict, 17, 1, &charstrings); + stbtt__dict_get_ints(&topdict, 0x100 | 6, 1, &cstype); + stbtt__dict_get_ints(&topdict, 0x100 | 36, 1, &fdarrayoff); + stbtt__dict_get_ints(&topdict, 0x100 | 37, 1, &fdselectoff); + info->subrs = stbtt__get_subrs(b, topdict); + + // we only support Type 2 charstrings + if (cstype != 2) return 0; + if (charstrings == 0) return 0; + + if (fdarrayoff) { + // looks like a CID font + if (!fdselectoff) return 0; + stbtt__buf_seek(&b, fdarrayoff); + info->fontdicts = stbtt__cff_get_index(&b); + info->fdselect = stbtt__buf_range(&b, fdselectoff, b.size-fdselectoff); + } + + stbtt__buf_seek(&b, charstrings); + info->charstrings = stbtt__cff_get_index(&b); + } + + t = stbtt__find_table(data, fontstart, "maxp"); + if (t) + info->numGlyphs = ttUSHORT(data+t+4); + else + info->numGlyphs = 0xffff; + + info->svg = -1; + + // find a cmap encoding table we understand *now* to avoid searching + // later. (todo: could make this installable) + // the same regardless of glyph. + numTables = ttUSHORT(data + cmap + 2); + info->index_map = 0; + for (i=0; i < numTables; ++i) { + stbtt_uint32 encoding_record = cmap + 4 + 8 * i; + // find an encoding we understand: + switch(ttUSHORT(data+encoding_record)) { + case STBTT_PLATFORM_ID_MICROSOFT: + switch (ttUSHORT(data+encoding_record+2)) { + case STBTT_MS_EID_UNICODE_BMP: + case STBTT_MS_EID_UNICODE_FULL: + // MS/Unicode + info->index_map = cmap + ttULONG(data+encoding_record+4); + break; + } + break; + case STBTT_PLATFORM_ID_UNICODE: + // Mac/iOS has these + // all the encodingIDs are unicode, so we don't bother to check it + info->index_map = cmap + ttULONG(data+encoding_record+4); + break; + } + } + if (info->index_map == 0) + return 0; + + info->indexToLocFormat = ttUSHORT(data+info->head + 50); + return 1; +} + +STBTT_DEF int stbtt_FindGlyphIndex(const stbtt_fontinfo *info, int unicode_codepoint) +{ + stbtt_uint8 *data = info->data; + stbtt_uint32 index_map = info->index_map; + + stbtt_uint16 format = ttUSHORT(data + index_map + 0); + if (format == 0) { // apple byte encoding + stbtt_int32 bytes = ttUSHORT(data + index_map + 2); + if (unicode_codepoint < bytes-6) + return ttBYTE(data + index_map + 6 + unicode_codepoint); + return 0; + } else if (format == 6) { + stbtt_uint32 first = ttUSHORT(data + index_map + 6); + stbtt_uint32 count = ttUSHORT(data + index_map + 8); + if ((stbtt_uint32) unicode_codepoint >= first && (stbtt_uint32) unicode_codepoint < first+count) + return ttUSHORT(data + index_map + 10 + (unicode_codepoint - first)*2); + return 0; + } else if (format == 2) { + STBTT_assert(0); // @TODO: high-byte mapping for japanese/chinese/korean + return 0; + } else if (format == 4) { // standard mapping for windows fonts: binary search collection of ranges + stbtt_uint16 segcount = ttUSHORT(data+index_map+6) >> 1; + stbtt_uint16 searchRange = ttUSHORT(data+index_map+8) >> 1; + stbtt_uint16 entrySelector = ttUSHORT(data+index_map+10); + stbtt_uint16 rangeShift = ttUSHORT(data+index_map+12) >> 1; + + // do a binary search of the segments + stbtt_uint32 endCount = index_map + 14; + stbtt_uint32 search = endCount; + + if (unicode_codepoint > 0xffff) + return 0; + + // they lie from endCount .. endCount + segCount + // but searchRange is the nearest power of two, so... 
+ if (unicode_codepoint >= ttUSHORT(data + search + rangeShift*2)) + search += rangeShift*2; + + // now decrement to bias correctly to find smallest + search -= 2; + while (entrySelector) { + stbtt_uint16 end; + searchRange >>= 1; + end = ttUSHORT(data + search + searchRange*2); + if (unicode_codepoint > end) + search += searchRange*2; + --entrySelector; + } + search += 2; + + { + stbtt_uint16 offset, start, last; + stbtt_uint16 item = (stbtt_uint16) ((search - endCount) >> 1); + + start = ttUSHORT(data + index_map + 14 + segcount*2 + 2 + 2*item); + last = ttUSHORT(data + endCount + 2*item); + if (unicode_codepoint < start || unicode_codepoint > last) + return 0; + + offset = ttUSHORT(data + index_map + 14 + segcount*6 + 2 + 2*item); + if (offset == 0) + return (stbtt_uint16) (unicode_codepoint + ttSHORT(data + index_map + 14 + segcount*4 + 2 + 2*item)); + + return ttUSHORT(data + offset + (unicode_codepoint-start)*2 + index_map + 14 + segcount*6 + 2 + 2*item); + } + } else if (format == 12 || format == 13) { + stbtt_uint32 ngroups = ttULONG(data+index_map+12); + stbtt_int32 low,high; + low = 0; high = (stbtt_int32)ngroups; + // Binary search the right group. + while (low < high) { + stbtt_int32 mid = low + ((high-low) >> 1); // rounds down, so low <= mid < high + stbtt_uint32 start_char = ttULONG(data+index_map+16+mid*12); + stbtt_uint32 end_char = ttULONG(data+index_map+16+mid*12+4); + if ((stbtt_uint32) unicode_codepoint < start_char) + high = mid; + else if ((stbtt_uint32) unicode_codepoint > end_char) + low = mid+1; + else { + stbtt_uint32 start_glyph = ttULONG(data+index_map+16+mid*12+8); + if (format == 12) + return start_glyph + unicode_codepoint-start_char; + else // format == 13 + return start_glyph; + } + } + return 0; // not found + } + // @TODO + STBTT_assert(0); + return 0; +} + +STBTT_DEF int stbtt_GetCodepointShape(const stbtt_fontinfo *info, int unicode_codepoint, stbtt_vertex **vertices) +{ + return stbtt_GetGlyphShape(info, stbtt_FindGlyphIndex(info, unicode_codepoint), vertices); +} + +static void stbtt_setvertex(stbtt_vertex *v, stbtt_uint8 type, stbtt_int32 x, stbtt_int32 y, stbtt_int32 cx, stbtt_int32 cy) +{ + v->type = type; + v->x = (stbtt_int16) x; + v->y = (stbtt_int16) y; + v->cx = (stbtt_int16) cx; + v->cy = (stbtt_int16) cy; +} + +static int stbtt__GetGlyfOffset(const stbtt_fontinfo *info, int glyph_index) +{ + int g1,g2; + + STBTT_assert(!info->cff.size); + + if (glyph_index >= info->numGlyphs) return -1; // glyph index out of range + if (info->indexToLocFormat >= 2) return -1; // unknown index->glyph map format + + if (info->indexToLocFormat == 0) { + g1 = info->glyf + ttUSHORT(info->data + info->loca + glyph_index * 2) * 2; + g2 = info->glyf + ttUSHORT(info->data + info->loca + glyph_index * 2 + 2) * 2; + } else { + g1 = info->glyf + ttULONG (info->data + info->loca + glyph_index * 4); + g2 = info->glyf + ttULONG (info->data + info->loca + glyph_index * 4 + 4); + } + + return g1==g2 ? 
-1 : g1; // if length is 0, return -1 +} + +static int stbtt__GetGlyphInfoT2(const stbtt_fontinfo *info, int glyph_index, int *x0, int *y0, int *x1, int *y1); + +STBTT_DEF int stbtt_GetGlyphBox(const stbtt_fontinfo *info, int glyph_index, int *x0, int *y0, int *x1, int *y1) +{ + if (info->cff.size) { + stbtt__GetGlyphInfoT2(info, glyph_index, x0, y0, x1, y1); + } else { + int g = stbtt__GetGlyfOffset(info, glyph_index); + if (g < 0) return 0; + + if (x0) *x0 = ttSHORT(info->data + g + 2); + if (y0) *y0 = ttSHORT(info->data + g + 4); + if (x1) *x1 = ttSHORT(info->data + g + 6); + if (y1) *y1 = ttSHORT(info->data + g + 8); + } + return 1; +} + +STBTT_DEF int stbtt_GetCodepointBox(const stbtt_fontinfo *info, int codepoint, int *x0, int *y0, int *x1, int *y1) +{ + return stbtt_GetGlyphBox(info, stbtt_FindGlyphIndex(info,codepoint), x0,y0,x1,y1); +} + +STBTT_DEF int stbtt_IsGlyphEmpty(const stbtt_fontinfo *info, int glyph_index) +{ + stbtt_int16 numberOfContours; + int g; + if (info->cff.size) + return stbtt__GetGlyphInfoT2(info, glyph_index, NULL, NULL, NULL, NULL) == 0; + g = stbtt__GetGlyfOffset(info, glyph_index); + if (g < 0) return 1; + numberOfContours = ttSHORT(info->data + g); + return numberOfContours == 0; +} + +static int stbtt__close_shape(stbtt_vertex *vertices, int num_vertices, int was_off, int start_off, + stbtt_int32 sx, stbtt_int32 sy, stbtt_int32 scx, stbtt_int32 scy, stbtt_int32 cx, stbtt_int32 cy) +{ + if (start_off) { + if (was_off) + stbtt_setvertex(&vertices[num_vertices++], STBTT_vcurve, (cx+scx)>>1, (cy+scy)>>1, cx,cy); + stbtt_setvertex(&vertices[num_vertices++], STBTT_vcurve, sx,sy,scx,scy); + } else { + if (was_off) + stbtt_setvertex(&vertices[num_vertices++], STBTT_vcurve,sx,sy,cx,cy); + else + stbtt_setvertex(&vertices[num_vertices++], STBTT_vline,sx,sy,0,0); + } + return num_vertices; +} + +static int stbtt__GetGlyphShapeTT(const stbtt_fontinfo *info, int glyph_index, stbtt_vertex **pvertices) +{ + stbtt_int16 numberOfContours; + stbtt_uint8 *endPtsOfContours; + stbtt_uint8 *data = info->data; + stbtt_vertex *vertices=0; + int num_vertices=0; + int g = stbtt__GetGlyfOffset(info, glyph_index); + + *pvertices = NULL; + + if (g < 0) return 0; + + numberOfContours = ttSHORT(data + g); + + if (numberOfContours > 0) { + stbtt_uint8 flags=0,flagcount; + stbtt_int32 ins, i,j=0,m,n, next_move, was_off=0, off, start_off=0; + stbtt_int32 x,y,cx,cy,sx,sy, scx,scy; + stbtt_uint8 *points; + endPtsOfContours = (data + g + 10); + ins = ttUSHORT(data + g + 10 + numberOfContours * 2); + points = data + g + 10 + numberOfContours * 2 + 2 + ins; + + n = 1+ttUSHORT(endPtsOfContours + numberOfContours*2-2); + + m = n + 2*numberOfContours; // a loose bound on how many vertices we might need + vertices = (stbtt_vertex *) STBTT_malloc(m * sizeof(vertices[0]), info->userdata); + if (vertices == 0) + return 0; + + next_move = 0; + flagcount=0; + + // in first pass, we load uninterpreted data into the allocated array + // above, shifted to the end of the array so we won't overwrite it when + // we create our final data starting from the front + + off = m - n; // starting offset for uninterpreted data, regardless of how m ends up being calculated + + // first load flags + + for (i=0; i < n; ++i) { + if (flagcount == 0) { + flags = *points++; + if (flags & 8) + flagcount = *points++; + } else + --flagcount; + vertices[off+i].type = flags; + } + + // now load x coordinates + x=0; + for (i=0; i < n; ++i) { + flags = vertices[off+i].type; + if (flags & 2) { + stbtt_int16 dx = *points++; + x += 
(flags & 16) ? dx : -dx; // ??? + } else { + if (!(flags & 16)) { + x = x + (stbtt_int16) (points[0]*256 + points[1]); + points += 2; + } + } + vertices[off+i].x = (stbtt_int16) x; + } + + // now load y coordinates + y=0; + for (i=0; i < n; ++i) { + flags = vertices[off+i].type; + if (flags & 4) { + stbtt_int16 dy = *points++; + y += (flags & 32) ? dy : -dy; // ??? + } else { + if (!(flags & 32)) { + y = y + (stbtt_int16) (points[0]*256 + points[1]); + points += 2; + } + } + vertices[off+i].y = (stbtt_int16) y; + } + + // now convert them to our format + num_vertices=0; + sx = sy = cx = cy = scx = scy = 0; + for (i=0; i < n; ++i) { + flags = vertices[off+i].type; + x = (stbtt_int16) vertices[off+i].x; + y = (stbtt_int16) vertices[off+i].y; + + if (next_move == i) { + if (i != 0) + num_vertices = stbtt__close_shape(vertices, num_vertices, was_off, start_off, sx,sy,scx,scy,cx,cy); + + // now start the new one + start_off = !(flags & 1); + if (start_off) { + // if we start off with an off-curve point, then when we need to find a point on the curve + // where we can start, and we need to save some state for when we wraparound. + scx = x; + scy = y; + if (!(vertices[off+i+1].type & 1)) { + // next point is also a curve point, so interpolate an on-point curve + sx = (x + (stbtt_int32) vertices[off+i+1].x) >> 1; + sy = (y + (stbtt_int32) vertices[off+i+1].y) >> 1; + } else { + // otherwise just use the next point as our start point + sx = (stbtt_int32) vertices[off+i+1].x; + sy = (stbtt_int32) vertices[off+i+1].y; + ++i; // we're using point i+1 as the starting point, so skip it + } + } else { + sx = x; + sy = y; + } + stbtt_setvertex(&vertices[num_vertices++], STBTT_vmove,sx,sy,0,0); + was_off = 0; + next_move = 1 + ttUSHORT(endPtsOfContours+j*2); + ++j; + } else { + if (!(flags & 1)) { // if it's a curve + if (was_off) // two off-curve control points in a row means interpolate an on-curve midpoint + stbtt_setvertex(&vertices[num_vertices++], STBTT_vcurve, (cx+x)>>1, (cy+y)>>1, cx, cy); + cx = x; + cy = y; + was_off = 1; + } else { + if (was_off) + stbtt_setvertex(&vertices[num_vertices++], STBTT_vcurve, x,y, cx, cy); + else + stbtt_setvertex(&vertices[num_vertices++], STBTT_vline, x,y,0,0); + was_off = 0; + } + } + } + num_vertices = stbtt__close_shape(vertices, num_vertices, was_off, start_off, sx,sy,scx,scy,cx,cy); + } else if (numberOfContours < 0) { + // Compound shapes. 
+ int more = 1; + stbtt_uint8 *comp = data + g + 10; + num_vertices = 0; + vertices = 0; + while (more) { + stbtt_uint16 flags, gidx; + int comp_num_verts = 0, i; + stbtt_vertex *comp_verts = 0, *tmp = 0; + float mtx[6] = {1,0,0,1,0,0}, m, n; + + flags = ttSHORT(comp); comp+=2; + gidx = ttSHORT(comp); comp+=2; + + if (flags & 2) { // XY values + if (flags & 1) { // shorts + mtx[4] = ttSHORT(comp); comp+=2; + mtx[5] = ttSHORT(comp); comp+=2; + } else { + mtx[4] = ttCHAR(comp); comp+=1; + mtx[5] = ttCHAR(comp); comp+=1; + } + } + else { + // @TODO handle matching point + STBTT_assert(0); + } + if (flags & (1<<3)) { // WE_HAVE_A_SCALE + mtx[0] = mtx[3] = ttSHORT(comp)/16384.0f; comp+=2; + mtx[1] = mtx[2] = 0; + } else if (flags & (1<<6)) { // WE_HAVE_AN_X_AND_YSCALE + mtx[0] = ttSHORT(comp)/16384.0f; comp+=2; + mtx[1] = mtx[2] = 0; + mtx[3] = ttSHORT(comp)/16384.0f; comp+=2; + } else if (flags & (1<<7)) { // WE_HAVE_A_TWO_BY_TWO + mtx[0] = ttSHORT(comp)/16384.0f; comp+=2; + mtx[1] = ttSHORT(comp)/16384.0f; comp+=2; + mtx[2] = ttSHORT(comp)/16384.0f; comp+=2; + mtx[3] = ttSHORT(comp)/16384.0f; comp+=2; + } + + // Find transformation scales. + m = (float) STBTT_sqrt(mtx[0]*mtx[0] + mtx[1]*mtx[1]); + n = (float) STBTT_sqrt(mtx[2]*mtx[2] + mtx[3]*mtx[3]); + + // Get indexed glyph. + comp_num_verts = stbtt_GetGlyphShape(info, gidx, &comp_verts); + if (comp_num_verts > 0) { + // Transform vertices. + for (i = 0; i < comp_num_verts; ++i) { + stbtt_vertex* v = &comp_verts[i]; + stbtt_vertex_type x,y; + x=v->x; y=v->y; + v->x = (stbtt_vertex_type)(m * (mtx[0]*x + mtx[2]*y + mtx[4])); + v->y = (stbtt_vertex_type)(n * (mtx[1]*x + mtx[3]*y + mtx[5])); + x=v->cx; y=v->cy; + v->cx = (stbtt_vertex_type)(m * (mtx[0]*x + mtx[2]*y + mtx[4])); + v->cy = (stbtt_vertex_type)(n * (mtx[1]*x + mtx[3]*y + mtx[5])); + } + // Append vertices. + tmp = (stbtt_vertex*)STBTT_malloc((num_vertices+comp_num_verts)*sizeof(stbtt_vertex), info->userdata); + if (!tmp) { + if (vertices) STBTT_free(vertices, info->userdata); + if (comp_verts) STBTT_free(comp_verts, info->userdata); + return 0; + } + if (num_vertices > 0 && vertices) STBTT_memcpy(tmp, vertices, num_vertices*sizeof(stbtt_vertex)); + STBTT_memcpy(tmp+num_vertices, comp_verts, comp_num_verts*sizeof(stbtt_vertex)); + if (vertices) STBTT_free(vertices, info->userdata); + vertices = tmp; + STBTT_free(comp_verts, info->userdata); + num_vertices += comp_num_verts; + } + // More components ? 
+ more = flags & (1<<5); + } + } else { + // numberOfCounters == 0, do nothing + } + + *pvertices = vertices; + return num_vertices; +} + +typedef struct +{ + int bounds; + int started; + float first_x, first_y; + float x, y; + stbtt_int32 min_x, max_x, min_y, max_y; + + stbtt_vertex *pvertices; + int num_vertices; +} stbtt__csctx; + +#define STBTT__CSCTX_INIT(bounds) {bounds,0, 0,0, 0,0, 0,0,0,0, NULL, 0} + +static void stbtt__track_vertex(stbtt__csctx *c, stbtt_int32 x, stbtt_int32 y) +{ + if (x > c->max_x || !c->started) c->max_x = x; + if (y > c->max_y || !c->started) c->max_y = y; + if (x < c->min_x || !c->started) c->min_x = x; + if (y < c->min_y || !c->started) c->min_y = y; + c->started = 1; +} + +static void stbtt__csctx_v(stbtt__csctx *c, stbtt_uint8 type, stbtt_int32 x, stbtt_int32 y, stbtt_int32 cx, stbtt_int32 cy, stbtt_int32 cx1, stbtt_int32 cy1) +{ + if (c->bounds) { + stbtt__track_vertex(c, x, y); + if (type == STBTT_vcubic) { + stbtt__track_vertex(c, cx, cy); + stbtt__track_vertex(c, cx1, cy1); + } + } else { + stbtt_setvertex(&c->pvertices[c->num_vertices], type, x, y, cx, cy); + c->pvertices[c->num_vertices].cx1 = (stbtt_int16) cx1; + c->pvertices[c->num_vertices].cy1 = (stbtt_int16) cy1; + } + c->num_vertices++; +} + +static void stbtt__csctx_close_shape(stbtt__csctx *ctx) +{ + if (ctx->first_x != ctx->x || ctx->first_y != ctx->y) + stbtt__csctx_v(ctx, STBTT_vline, (int)ctx->first_x, (int)ctx->first_y, 0, 0, 0, 0); +} + +static void stbtt__csctx_rmove_to(stbtt__csctx *ctx, float dx, float dy) +{ + stbtt__csctx_close_shape(ctx); + ctx->first_x = ctx->x = ctx->x + dx; + ctx->first_y = ctx->y = ctx->y + dy; + stbtt__csctx_v(ctx, STBTT_vmove, (int)ctx->x, (int)ctx->y, 0, 0, 0, 0); +} + +static void stbtt__csctx_rline_to(stbtt__csctx *ctx, float dx, float dy) +{ + ctx->x += dx; + ctx->y += dy; + stbtt__csctx_v(ctx, STBTT_vline, (int)ctx->x, (int)ctx->y, 0, 0, 0, 0); +} + +static void stbtt__csctx_rccurve_to(stbtt__csctx *ctx, float dx1, float dy1, float dx2, float dy2, float dx3, float dy3) +{ + float cx1 = ctx->x + dx1; + float cy1 = ctx->y + dy1; + float cx2 = cx1 + dx2; + float cy2 = cy1 + dy2; + ctx->x = cx2 + dx3; + ctx->y = cy2 + dy3; + stbtt__csctx_v(ctx, STBTT_vcubic, (int)ctx->x, (int)ctx->y, (int)cx1, (int)cy1, (int)cx2, (int)cy2); +} + +static stbtt__buf stbtt__get_subr(stbtt__buf idx, int n) +{ + int count = stbtt__cff_index_count(&idx); + int bias = 107; + if (count >= 33900) + bias = 32768; + else if (count >= 1240) + bias = 1131; + n += bias; + if (n < 0 || n >= count) + return stbtt__new_buf(NULL, 0); + return stbtt__cff_index_get(idx, n); +} + +static stbtt__buf stbtt__cid_get_glyph_subrs(const stbtt_fontinfo *info, int glyph_index) +{ + stbtt__buf fdselect = info->fdselect; + int nranges, start, end, v, fmt, fdselector = -1, i; + + stbtt__buf_seek(&fdselect, 0); + fmt = stbtt__buf_get8(&fdselect); + if (fmt == 0) { + // untested + stbtt__buf_skip(&fdselect, glyph_index); + fdselector = stbtt__buf_get8(&fdselect); + } else if (fmt == 3) { + nranges = stbtt__buf_get16(&fdselect); + start = stbtt__buf_get16(&fdselect); + for (i = 0; i < nranges; i++) { + v = stbtt__buf_get8(&fdselect); + end = stbtt__buf_get16(&fdselect); + if (glyph_index >= start && glyph_index < end) { + fdselector = v; + break; + } + start = end; + } + } + if (fdselector == -1) stbtt__new_buf(NULL, 0); + return stbtt__get_subrs(info->cff, stbtt__cff_index_get(info->fontdicts, fdselector)); +} + +static int stbtt__run_charstring(const stbtt_fontinfo *info, int glyph_index, stbtt__csctx *c) 
+{ + int in_header = 1, maskbits = 0, subr_stack_height = 0, sp = 0, v, i, b0; + int has_subrs = 0, clear_stack; + float s[48]; + stbtt__buf subr_stack[10], subrs = info->subrs, b; + float f; + +#define STBTT__CSERR(s) (0) + + // this currently ignores the initial width value, which isn't needed if we have hmtx + b = stbtt__cff_index_get(info->charstrings, glyph_index); + while (b.cursor < b.size) { + i = 0; + clear_stack = 1; + b0 = stbtt__buf_get8(&b); + switch (b0) { + // @TODO implement hinting + case 0x13: // hintmask + case 0x14: // cntrmask + if (in_header) + maskbits += (sp / 2); // implicit "vstem" + in_header = 0; + stbtt__buf_skip(&b, (maskbits + 7) / 8); + break; + + case 0x01: // hstem + case 0x03: // vstem + case 0x12: // hstemhm + case 0x17: // vstemhm + maskbits += (sp / 2); + break; + + case 0x15: // rmoveto + in_header = 0; + if (sp < 2) return STBTT__CSERR("rmoveto stack"); + stbtt__csctx_rmove_to(c, s[sp-2], s[sp-1]); + break; + case 0x04: // vmoveto + in_header = 0; + if (sp < 1) return STBTT__CSERR("vmoveto stack"); + stbtt__csctx_rmove_to(c, 0, s[sp-1]); + break; + case 0x16: // hmoveto + in_header = 0; + if (sp < 1) return STBTT__CSERR("hmoveto stack"); + stbtt__csctx_rmove_to(c, s[sp-1], 0); + break; + + case 0x05: // rlineto + if (sp < 2) return STBTT__CSERR("rlineto stack"); + for (; i + 1 < sp; i += 2) + stbtt__csctx_rline_to(c, s[i], s[i+1]); + break; + + // hlineto/vlineto and vhcurveto/hvcurveto alternate horizontal and vertical + // starting from a different place. + + case 0x07: // vlineto + if (sp < 1) return STBTT__CSERR("vlineto stack"); + goto vlineto; + case 0x06: // hlineto + if (sp < 1) return STBTT__CSERR("hlineto stack"); + for (;;) { + if (i >= sp) break; + stbtt__csctx_rline_to(c, s[i], 0); + i++; + vlineto: + if (i >= sp) break; + stbtt__csctx_rline_to(c, 0, s[i]); + i++; + } + break; + + case 0x1F: // hvcurveto + if (sp < 4) return STBTT__CSERR("hvcurveto stack"); + goto hvcurveto; + case 0x1E: // vhcurveto + if (sp < 4) return STBTT__CSERR("vhcurveto stack"); + for (;;) { + if (i + 3 >= sp) break; + stbtt__csctx_rccurve_to(c, 0, s[i], s[i+1], s[i+2], s[i+3], (sp - i == 5) ? s[i + 4] : 0.0f); + i += 4; + hvcurveto: + if (i + 3 >= sp) break; + stbtt__csctx_rccurve_to(c, s[i], 0, s[i+1], s[i+2], (sp - i == 5) ? 
s[i+4] : 0.0f, s[i+3]); + i += 4; + } + break; + + case 0x08: // rrcurveto + if (sp < 6) return STBTT__CSERR("rcurveline stack"); + for (; i + 5 < sp; i += 6) + stbtt__csctx_rccurve_to(c, s[i], s[i+1], s[i+2], s[i+3], s[i+4], s[i+5]); + break; + + case 0x18: // rcurveline + if (sp < 8) return STBTT__CSERR("rcurveline stack"); + for (; i + 5 < sp - 2; i += 6) + stbtt__csctx_rccurve_to(c, s[i], s[i+1], s[i+2], s[i+3], s[i+4], s[i+5]); + if (i + 1 >= sp) return STBTT__CSERR("rcurveline stack"); + stbtt__csctx_rline_to(c, s[i], s[i+1]); + break; + + case 0x19: // rlinecurve + if (sp < 8) return STBTT__CSERR("rlinecurve stack"); + for (; i + 1 < sp - 6; i += 2) + stbtt__csctx_rline_to(c, s[i], s[i+1]); + if (i + 5 >= sp) return STBTT__CSERR("rlinecurve stack"); + stbtt__csctx_rccurve_to(c, s[i], s[i+1], s[i+2], s[i+3], s[i+4], s[i+5]); + break; + + case 0x1A: // vvcurveto + case 0x1B: // hhcurveto + if (sp < 4) return STBTT__CSERR("(vv|hh)curveto stack"); + f = 0.0; + if (sp & 1) { f = s[i]; i++; } + for (; i + 3 < sp; i += 4) { + if (b0 == 0x1B) + stbtt__csctx_rccurve_to(c, s[i], f, s[i+1], s[i+2], s[i+3], 0.0); + else + stbtt__csctx_rccurve_to(c, f, s[i], s[i+1], s[i+2], 0.0, s[i+3]); + f = 0.0; + } + break; + + case 0x0A: // callsubr + if (!has_subrs) { + if (info->fdselect.size) + subrs = stbtt__cid_get_glyph_subrs(info, glyph_index); + has_subrs = 1; + } + // FALLTHROUGH + case 0x1D: // callgsubr + if (sp < 1) return STBTT__CSERR("call(g|)subr stack"); + v = (int) s[--sp]; + if (subr_stack_height >= 10) return STBTT__CSERR("recursion limit"); + subr_stack[subr_stack_height++] = b; + b = stbtt__get_subr(b0 == 0x0A ? subrs : info->gsubrs, v); + if (b.size == 0) return STBTT__CSERR("subr not found"); + b.cursor = 0; + clear_stack = 0; + break; + + case 0x0B: // return + if (subr_stack_height <= 0) return STBTT__CSERR("return outside subr"); + b = subr_stack[--subr_stack_height]; + clear_stack = 0; + break; + + case 0x0E: // endchar + stbtt__csctx_close_shape(c); + return 1; + + case 0x0C: { // two-byte escape + float dx1, dx2, dx3, dx4, dx5, dx6, dy1, dy2, dy3, dy4, dy5, dy6; + float dx, dy; + int b1 = stbtt__buf_get8(&b); + switch (b1) { + // @TODO These "flex" implementations ignore the flex-depth and resolution, + // and always draw beziers. 
+ case 0x22: // hflex + if (sp < 7) return STBTT__CSERR("hflex stack"); + dx1 = s[0]; + dx2 = s[1]; + dy2 = s[2]; + dx3 = s[3]; + dx4 = s[4]; + dx5 = s[5]; + dx6 = s[6]; + stbtt__csctx_rccurve_to(c, dx1, 0, dx2, dy2, dx3, 0); + stbtt__csctx_rccurve_to(c, dx4, 0, dx5, -dy2, dx6, 0); + break; + + case 0x23: // flex + if (sp < 13) return STBTT__CSERR("flex stack"); + dx1 = s[0]; + dy1 = s[1]; + dx2 = s[2]; + dy2 = s[3]; + dx3 = s[4]; + dy3 = s[5]; + dx4 = s[6]; + dy4 = s[7]; + dx5 = s[8]; + dy5 = s[9]; + dx6 = s[10]; + dy6 = s[11]; + //fd is s[12] + stbtt__csctx_rccurve_to(c, dx1, dy1, dx2, dy2, dx3, dy3); + stbtt__csctx_rccurve_to(c, dx4, dy4, dx5, dy5, dx6, dy6); + break; + + case 0x24: // hflex1 + if (sp < 9) return STBTT__CSERR("hflex1 stack"); + dx1 = s[0]; + dy1 = s[1]; + dx2 = s[2]; + dy2 = s[3]; + dx3 = s[4]; + dx4 = s[5]; + dx5 = s[6]; + dy5 = s[7]; + dx6 = s[8]; + stbtt__csctx_rccurve_to(c, dx1, dy1, dx2, dy2, dx3, 0); + stbtt__csctx_rccurve_to(c, dx4, 0, dx5, dy5, dx6, -(dy1+dy2+dy5)); + break; + + case 0x25: // flex1 + if (sp < 11) return STBTT__CSERR("flex1 stack"); + dx1 = s[0]; + dy1 = s[1]; + dx2 = s[2]; + dy2 = s[3]; + dx3 = s[4]; + dy3 = s[5]; + dx4 = s[6]; + dy4 = s[7]; + dx5 = s[8]; + dy5 = s[9]; + dx6 = dy6 = s[10]; + dx = dx1+dx2+dx3+dx4+dx5; + dy = dy1+dy2+dy3+dy4+dy5; + if (STBTT_fabs(dx) > STBTT_fabs(dy)) + dy6 = -dy; + else + dx6 = -dx; + stbtt__csctx_rccurve_to(c, dx1, dy1, dx2, dy2, dx3, dy3); + stbtt__csctx_rccurve_to(c, dx4, dy4, dx5, dy5, dx6, dy6); + break; + + default: + return STBTT__CSERR("unimplemented"); + } + } break; + + default: + if (b0 != 255 && b0 != 28 && b0 < 32) + return STBTT__CSERR("reserved operator"); + + // push immediate + if (b0 == 255) { + f = (float)(stbtt_int32)stbtt__buf_get32(&b) / 0x10000; + } else { + stbtt__buf_skip(&b, -1); + f = (float)(stbtt_int16)stbtt__cff_int(&b); + } + if (sp >= 48) return STBTT__CSERR("push stack overflow"); + s[sp++] = f; + clear_stack = 0; + break; + } + if (clear_stack) sp = 0; + } + return STBTT__CSERR("no endchar"); + +#undef STBTT__CSERR +} + +static int stbtt__GetGlyphShapeT2(const stbtt_fontinfo *info, int glyph_index, stbtt_vertex **pvertices) +{ + // runs the charstring twice, once to count and once to output (to avoid realloc) + stbtt__csctx count_ctx = STBTT__CSCTX_INIT(1); + stbtt__csctx output_ctx = STBTT__CSCTX_INIT(0); + if (stbtt__run_charstring(info, glyph_index, &count_ctx)) { + *pvertices = (stbtt_vertex*)STBTT_malloc(count_ctx.num_vertices*sizeof(stbtt_vertex), info->userdata); + output_ctx.pvertices = *pvertices; + if (stbtt__run_charstring(info, glyph_index, &output_ctx)) { + STBTT_assert(output_ctx.num_vertices == count_ctx.num_vertices); + return output_ctx.num_vertices; + } + } + *pvertices = NULL; + return 0; +} + +static int stbtt__GetGlyphInfoT2(const stbtt_fontinfo *info, int glyph_index, int *x0, int *y0, int *x1, int *y1) +{ + stbtt__csctx c = STBTT__CSCTX_INIT(1); + int r = stbtt__run_charstring(info, glyph_index, &c); + if (x0) *x0 = r ? c.min_x : 0; + if (y0) *y0 = r ? c.min_y : 0; + if (x1) *x1 = r ? c.max_x : 0; + if (y1) *y1 = r ? c.max_y : 0; + return r ? 
c.num_vertices : 0; +} + +STBTT_DEF int stbtt_GetGlyphShape(const stbtt_fontinfo *info, int glyph_index, stbtt_vertex **pvertices) +{ + if (!info->cff.size) + return stbtt__GetGlyphShapeTT(info, glyph_index, pvertices); + else + return stbtt__GetGlyphShapeT2(info, glyph_index, pvertices); +} + +STBTT_DEF void stbtt_GetGlyphHMetrics(const stbtt_fontinfo *info, int glyph_index, int *advanceWidth, int *leftSideBearing) +{ + stbtt_uint16 numOfLongHorMetrics = ttUSHORT(info->data+info->hhea + 34); + if (glyph_index < numOfLongHorMetrics) { + if (advanceWidth) *advanceWidth = ttSHORT(info->data + info->hmtx + 4*glyph_index); + if (leftSideBearing) *leftSideBearing = ttSHORT(info->data + info->hmtx + 4*glyph_index + 2); + } else { + if (advanceWidth) *advanceWidth = ttSHORT(info->data + info->hmtx + 4*(numOfLongHorMetrics-1)); + if (leftSideBearing) *leftSideBearing = ttSHORT(info->data + info->hmtx + 4*numOfLongHorMetrics + 2*(glyph_index - numOfLongHorMetrics)); + } +} + +STBTT_DEF int stbtt_GetKerningTableLength(const stbtt_fontinfo *info) +{ + stbtt_uint8 *data = info->data + info->kern; + + // we only look at the first table. it must be 'horizontal' and format 0. + if (!info->kern) + return 0; + if (ttUSHORT(data+2) < 1) // number of tables, need at least 1 + return 0; + if (ttUSHORT(data+8) != 1) // horizontal flag must be set in format + return 0; + + return ttUSHORT(data+10); +} + +STBTT_DEF int stbtt_GetKerningTable(const stbtt_fontinfo *info, stbtt_kerningentry* table, int table_length) +{ + stbtt_uint8 *data = info->data + info->kern; + int k, length; + + // we only look at the first table. it must be 'horizontal' and format 0. + if (!info->kern) + return 0; + if (ttUSHORT(data+2) < 1) // number of tables, need at least 1 + return 0; + if (ttUSHORT(data+8) != 1) // horizontal flag must be set in format + return 0; + + length = ttUSHORT(data+10); + if (table_length < length) + length = table_length; + + for (k = 0; k < length; k++) + { + table[k].glyph1 = ttUSHORT(data+18+(k*6)); + table[k].glyph2 = ttUSHORT(data+20+(k*6)); + table[k].advance = ttSHORT(data+22+(k*6)); + } + + return length; +} + +static int stbtt__GetGlyphKernInfoAdvance(const stbtt_fontinfo *info, int glyph1, int glyph2) +{ + stbtt_uint8 *data = info->data + info->kern; + stbtt_uint32 needle, straw; + int l, r, m; + + // we only look at the first table. it must be 'horizontal' and format 0. + if (!info->kern) + return 0; + if (ttUSHORT(data+2) < 1) // number of tables, need at least 1 + return 0; + if (ttUSHORT(data+8) != 1) // horizontal flag must be set in format + return 0; + + l = 0; + r = ttUSHORT(data+10) - 1; + needle = glyph1 << 16 | glyph2; + while (l <= r) { + m = (l + r) >> 1; + straw = ttULONG(data+18+(m*6)); // note: unaligned read + if (needle < straw) + r = m - 1; + else if (needle > straw) + l = m + 1; + else + return ttSHORT(data+22+(m*6)); + } + return 0; +} + +static stbtt_int32 stbtt__GetCoverageIndex(stbtt_uint8 *coverageTable, int glyph) +{ + stbtt_uint16 coverageFormat = ttUSHORT(coverageTable); + switch (coverageFormat) { + case 1: { + stbtt_uint16 glyphCount = ttUSHORT(coverageTable + 2); + + // Binary search. 
+ stbtt_int32 l=0, r=glyphCount-1, m;
+ int straw, needle=glyph;
+ while (l <= r) {
+ stbtt_uint8 *glyphArray = coverageTable + 4;
+ stbtt_uint16 glyphID;
+ m = (l + r) >> 1;
+ glyphID = ttUSHORT(glyphArray + 2 * m);
+ straw = glyphID;
+ if (needle < straw)
+ r = m - 1;
+ else if (needle > straw)
+ l = m + 1;
+ else {
+ return m;
+ }
+ }
+ break;
+ }
+
+ case 2: {
+ stbtt_uint16 rangeCount = ttUSHORT(coverageTable + 2);
+ stbtt_uint8 *rangeArray = coverageTable + 4;
+
+ // Binary search.
+ stbtt_int32 l=0, r=rangeCount-1, m;
+ int strawStart, strawEnd, needle=glyph;
+ while (l <= r) {
+ stbtt_uint8 *rangeRecord;
+ m = (l + r) >> 1;
+ rangeRecord = rangeArray + 6 * m;
+ strawStart = ttUSHORT(rangeRecord);
+ strawEnd = ttUSHORT(rangeRecord + 2);
+ if (needle < strawStart)
+ r = m - 1;
+ else if (needle > strawEnd)
+ l = m + 1;
+ else {
+ stbtt_uint16 startCoverageIndex = ttUSHORT(rangeRecord + 4);
+ return startCoverageIndex + glyph - strawStart;
+ }
+ }
+ break;
+ }
+
+ default: return -1; // unsupported
+ }
+
+ return -1;
+}
+
+static stbtt_int32 stbtt__GetGlyphClass(stbtt_uint8 *classDefTable, int glyph)
+{
+ stbtt_uint16 classDefFormat = ttUSHORT(classDefTable);
+ switch (classDefFormat)
+ {
+ case 1: {
+ stbtt_uint16 startGlyphID = ttUSHORT(classDefTable + 2);
+ stbtt_uint16 glyphCount = ttUSHORT(classDefTable + 4);
+ stbtt_uint8 *classDef1ValueArray = classDefTable + 6;
+
+ if (glyph >= startGlyphID && glyph < startGlyphID + glyphCount)
+ return (stbtt_int32)ttUSHORT(classDef1ValueArray + 2 * (glyph - startGlyphID));
+ break;
+ }
+
+ case 2: {
+ stbtt_uint16 classRangeCount = ttUSHORT(classDefTable + 2);
+ stbtt_uint8 *classRangeRecords = classDefTable + 4;
+
+ // Binary search.
+ stbtt_int32 l=0, r=classRangeCount-1, m;
+ int strawStart, strawEnd, needle=glyph;
+ while (l <= r) {
+ stbtt_uint8 *classRangeRecord;
+ m = (l + r) >> 1;
+ classRangeRecord = classRangeRecords + 6 * m;
+ strawStart = ttUSHORT(classRangeRecord);
+ strawEnd = ttUSHORT(classRangeRecord + 2);
+ if (needle < strawStart)
+ r = m - 1;
+ else if (needle > strawEnd)
+ l = m + 1;
+ else
+ return (stbtt_int32)ttUSHORT(classRangeRecord + 4);
+ }
+ break;
+ }
+
+ default:
+ return -1; // Unsupported definition type, return an error.
+ }
+
+ // "All glyphs not assigned to a class fall into class 0". (OpenType spec)
+ return 0;
+}
+
+// Define to STBTT_assert(x) if you want to break on unimplemented formats.
+#define STBTT_GPOS_TODO_assert(x)
+
+static stbtt_int32 stbtt__GetGlyphGPOSInfoAdvance(const stbtt_fontinfo *info, int glyph1, int glyph2)
+{
+ stbtt_uint16 lookupListOffset;
+ stbtt_uint8 *lookupList;
+ stbtt_uint16 lookupCount;
+ stbtt_uint8 *data;
+ stbtt_int32 i, sti;
+
+ if (!info->gpos) return 0;
+
+ data = info->data + info->gpos;
+
+ if (ttUSHORT(data+0) != 1) return 0; // Major version 1
+ if (ttUSHORT(data+2) != 0) return 0; // Minor version 0
+
+ lookupListOffset = ttUSHORT(data+8);
+ lookupList = data + lookupListOffset;
+ lookupCount = ttUSHORT(lookupList);
+
+ for (i=0; i<lookupCount; ++i) {
+ stbtt_uint16 lookupOffset = ttUSHORT(lookupList + 2 + 2 * i);
+ stbtt_uint8 *lookupTable = lookupList + lookupOffset;
+
+ stbtt_uint16 lookupType = ttUSHORT(lookupTable);
+ stbtt_uint16 subTableCount = ttUSHORT(lookupTable + 4);
+ stbtt_uint8 *subTableOffsets = lookupTable + 6;
+ if (lookupType != 2) // Pair Adjustment Positioning Subtable
+ continue;
+
+ for (sti=0; sti<subTableCount; sti++) {
+ stbtt_uint16 subtableOffset = ttUSHORT(subTableOffsets + 2 * sti);
+ stbtt_uint8 *table = lookupTable + subtableOffset;
+ stbtt_uint16 posFormat = ttUSHORT(table);
+ stbtt_uint16 coverageOffset = ttUSHORT(table + 2);
+ stbtt_int32 coverageIndex = stbtt__GetCoverageIndex(table + coverageOffset, glyph1);
+ if (coverageIndex == -1) continue;
+
+ switch (posFormat) {
+ case 1: {
+ stbtt_int32 l, r, m;
+ int straw, needle;
+ stbtt_uint16 valueFormat1 = ttUSHORT(table + 4);
+ stbtt_uint16 valueFormat2 = ttUSHORT(table + 6);
+ if (valueFormat1 == 4 && valueFormat2 == 0) { // Support more formats?
+ stbtt_int32 valueRecordPairSizeInBytes = 2;
+ stbtt_uint16 pairSetCount = ttUSHORT(table + 8);
+ stbtt_uint16 pairPosOffset = ttUSHORT(table + 10 + 2 * coverageIndex);
+ stbtt_uint8 *pairValueTable = table + pairPosOffset;
+ stbtt_uint16 pairValueCount = ttUSHORT(pairValueTable);
+ stbtt_uint8 *pairValueArray = pairValueTable + 2;
+
+ if (coverageIndex >= pairSetCount) return 0;
+
+ needle=glyph2;
+ r=pairValueCount-1;
+ l=0;
+
+ // Binary search.
+ while (l <= r) { + stbtt_uint16 secondGlyph; + stbtt_uint8 *pairValue; + m = (l + r) >> 1; + pairValue = pairValueArray + (2 + valueRecordPairSizeInBytes) * m; + secondGlyph = ttUSHORT(pairValue); + straw = secondGlyph; + if (needle < straw) + r = m - 1; + else if (needle > straw) + l = m + 1; + else { + stbtt_int16 xAdvance = ttSHORT(pairValue + 2); + return xAdvance; + } + } + } else + return 0; + break; + } + + case 2: { + stbtt_uint16 valueFormat1 = ttUSHORT(table + 4); + stbtt_uint16 valueFormat2 = ttUSHORT(table + 6); + if (valueFormat1 == 4 && valueFormat2 == 0) { // Support more formats? + stbtt_uint16 classDef1Offset = ttUSHORT(table + 8); + stbtt_uint16 classDef2Offset = ttUSHORT(table + 10); + int glyph1class = stbtt__GetGlyphClass(table + classDef1Offset, glyph1); + int glyph2class = stbtt__GetGlyphClass(table + classDef2Offset, glyph2); + + stbtt_uint16 class1Count = ttUSHORT(table + 12); + stbtt_uint16 class2Count = ttUSHORT(table + 14); + stbtt_uint8 *class1Records, *class2Records; + stbtt_int16 xAdvance; + + if (glyph1class < 0 || glyph1class >= class1Count) return 0; // malformed + if (glyph2class < 0 || glyph2class >= class2Count) return 0; // malformed + + class1Records = table + 16; + class2Records = class1Records + 2 * (glyph1class * class2Count); + xAdvance = ttSHORT(class2Records + 2 * glyph2class); + return xAdvance; + } else + return 0; + break; + } + + default: + return 0; // Unsupported position format + } + } + } + + return 0; +} + +STBTT_DEF int stbtt_GetGlyphKernAdvance(const stbtt_fontinfo *info, int g1, int g2) +{ + int xAdvance = 0; + + if (info->gpos) + xAdvance += stbtt__GetGlyphGPOSInfoAdvance(info, g1, g2); + else if (info->kern) + xAdvance += stbtt__GetGlyphKernInfoAdvance(info, g1, g2); + + return xAdvance; +} + +STBTT_DEF int stbtt_GetCodepointKernAdvance(const stbtt_fontinfo *info, int ch1, int ch2) +{ + if (!info->kern && !info->gpos) // if no kerning table, don't waste time looking up both codepoint->glyphs + return 0; + return stbtt_GetGlyphKernAdvance(info, stbtt_FindGlyphIndex(info,ch1), stbtt_FindGlyphIndex(info,ch2)); +} + +STBTT_DEF void stbtt_GetCodepointHMetrics(const stbtt_fontinfo *info, int codepoint, int *advanceWidth, int *leftSideBearing) +{ + stbtt_GetGlyphHMetrics(info, stbtt_FindGlyphIndex(info,codepoint), advanceWidth, leftSideBearing); +} + +STBTT_DEF void stbtt_GetFontVMetrics(const stbtt_fontinfo *info, int *ascent, int *descent, int *lineGap) +{ + if (ascent ) *ascent = ttSHORT(info->data+info->hhea + 4); + if (descent) *descent = ttSHORT(info->data+info->hhea + 6); + if (lineGap) *lineGap = ttSHORT(info->data+info->hhea + 8); +} + +STBTT_DEF int stbtt_GetFontVMetricsOS2(const stbtt_fontinfo *info, int *typoAscent, int *typoDescent, int *typoLineGap) +{ + int tab = stbtt__find_table(info->data, info->fontstart, "OS/2"); + if (!tab) + return 0; + if (typoAscent ) *typoAscent = ttSHORT(info->data+tab + 68); + if (typoDescent) *typoDescent = ttSHORT(info->data+tab + 70); + if (typoLineGap) *typoLineGap = ttSHORT(info->data+tab + 72); + return 1; +} + +STBTT_DEF void stbtt_GetFontBoundingBox(const stbtt_fontinfo *info, int *x0, int *y0, int *x1, int *y1) +{ + *x0 = ttSHORT(info->data + info->head + 36); + *y0 = ttSHORT(info->data + info->head + 38); + *x1 = ttSHORT(info->data + info->head + 40); + *y1 = ttSHORT(info->data + info->head + 42); +} + +STBTT_DEF float stbtt_ScaleForPixelHeight(const stbtt_fontinfo *info, float height) +{ + int fheight = ttSHORT(info->data + info->hhea + 4) - ttSHORT(info->data + info->hhea + 6); + 
return (float) height / fheight;
+}
+
+STBTT_DEF float stbtt_ScaleForMappingEmToPixels(const stbtt_fontinfo *info, float pixels)
+{
+ int unitsPerEm = ttUSHORT(info->data + info->head + 18);
+ return pixels / unitsPerEm;
+}
+
+STBTT_DEF void stbtt_FreeShape(const stbtt_fontinfo *info, stbtt_vertex *v)
+{
+ STBTT_free(v, info->userdata);
+}
+
+STBTT_DEF stbtt_uint8 *stbtt_FindSVGDoc(const stbtt_fontinfo *info, int gl)
+{
+ int i;
+ stbtt_uint8 *data = info->data;
+ stbtt_uint8 *svg_doc_list = data + stbtt__get_svg((stbtt_fontinfo *) info);
+
+ int numEntries = ttUSHORT(svg_doc_list);
+ stbtt_uint8 *svg_docs = svg_doc_list + 2;
+
+ for(i=0; i<numEntries; i++) {
+ stbtt_uint8 *svg_doc = svg_docs + (12 * i);
+ if ((gl >= ttUSHORT(svg_doc)) && (gl <= ttUSHORT(svg_doc + 2)))
+ return svg_doc;
+ }
+ return 0;
+}
+
+STBTT_DEF int stbtt_GetGlyphSVG(const stbtt_fontinfo *info, int gl, const char **svg)
+{
+ stbtt_uint8 *data = info->data;
+ stbtt_uint8 *svg_doc;
+
+ if (info->svg == 0)
+ return 0;
+
+ svg_doc = stbtt_FindSVGDoc(info, gl);
+ if (svg_doc != NULL) {
+ *svg = (char *) data + info->svg + ttULONG(svg_doc + 4);
+ return ttULONG(svg_doc + 8);
+ } else {
+ return 0;
+ }
+}
+
+STBTT_DEF int stbtt_GetCodepointSVG(const stbtt_fontinfo *info, int unicode_codepoint, const char **svg)
+{
+ return stbtt_GetGlyphSVG(info, stbtt_FindGlyphIndex(info, unicode_codepoint), svg);
+}
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// antialiasing software rasterizer
+//
+
+STBTT_DEF void stbtt_GetGlyphBitmapBoxSubpixel(const stbtt_fontinfo *font, int glyph, float scale_x, float scale_y,float shift_x, float shift_y, int *ix0, int *iy0, int *ix1, int *iy1)
+{
+ int x0=0,y0=0,x1,y1; // =0 suppresses compiler warning
+ if (!stbtt_GetGlyphBox(font, glyph, &x0,&y0,&x1,&y1)) {
+ // e.g. space character
+ if (ix0) *ix0 = 0;
+ if (iy0) *iy0 = 0;
+ if (ix1) *ix1 = 0;
+ if (iy1) *iy1 = 0;
+ } else {
+ // move to integral bboxes (treating pixels as little squares, what pixels get touched)?
+ if (ix0) *ix0 = STBTT_ifloor( x0 * scale_x + shift_x); + if (iy0) *iy0 = STBTT_ifloor(-y1 * scale_y + shift_y); + if (ix1) *ix1 = STBTT_iceil ( x1 * scale_x + shift_x); + if (iy1) *iy1 = STBTT_iceil (-y0 * scale_y + shift_y); + } +} + +STBTT_DEF void stbtt_GetGlyphBitmapBox(const stbtt_fontinfo *font, int glyph, float scale_x, float scale_y, int *ix0, int *iy0, int *ix1, int *iy1) +{ + stbtt_GetGlyphBitmapBoxSubpixel(font, glyph, scale_x, scale_y,0.0f,0.0f, ix0, iy0, ix1, iy1); +} + +STBTT_DEF void stbtt_GetCodepointBitmapBoxSubpixel(const stbtt_fontinfo *font, int codepoint, float scale_x, float scale_y, float shift_x, float shift_y, int *ix0, int *iy0, int *ix1, int *iy1) +{ + stbtt_GetGlyphBitmapBoxSubpixel(font, stbtt_FindGlyphIndex(font,codepoint), scale_x, scale_y,shift_x,shift_y, ix0,iy0,ix1,iy1); +} + +STBTT_DEF void stbtt_GetCodepointBitmapBox(const stbtt_fontinfo *font, int codepoint, float scale_x, float scale_y, int *ix0, int *iy0, int *ix1, int *iy1) +{ + stbtt_GetCodepointBitmapBoxSubpixel(font, codepoint, scale_x, scale_y,0.0f,0.0f, ix0,iy0,ix1,iy1); +} + +////////////////////////////////////////////////////////////////////////////// +// +// Rasterizer + +typedef struct stbtt__hheap_chunk +{ + struct stbtt__hheap_chunk *next; +} stbtt__hheap_chunk; + +typedef struct stbtt__hheap +{ + struct stbtt__hheap_chunk *head; + void *first_free; + int num_remaining_in_head_chunk; +} stbtt__hheap; + +static void *stbtt__hheap_alloc(stbtt__hheap *hh, size_t size, void *userdata) +{ + if (hh->first_free) { + void *p = hh->first_free; + hh->first_free = * (void **) p; + return p; + } else { + if (hh->num_remaining_in_head_chunk == 0) { + int count = (size < 32 ? 2000 : size < 128 ? 800 : 100); + stbtt__hheap_chunk *c = (stbtt__hheap_chunk *) STBTT_malloc(sizeof(stbtt__hheap_chunk) + size * count, userdata); + if (c == NULL) + return NULL; + c->next = hh->head; + hh->head = c; + hh->num_remaining_in_head_chunk = count; + } + --hh->num_remaining_in_head_chunk; + return (char *) (hh->head) + sizeof(stbtt__hheap_chunk) + size * hh->num_remaining_in_head_chunk; + } +} + +static void stbtt__hheap_free(stbtt__hheap *hh, void *p) +{ + *(void **) p = hh->first_free; + hh->first_free = p; +} + +static void stbtt__hheap_cleanup(stbtt__hheap *hh, void *userdata) +{ + stbtt__hheap_chunk *c = hh->head; + while (c) { + stbtt__hheap_chunk *n = c->next; + STBTT_free(c, userdata); + c = n; + } +} + +typedef struct stbtt__edge { + float x0,y0, x1,y1; + int invert; +} stbtt__edge; + + +typedef struct stbtt__active_edge +{ + struct stbtt__active_edge *next; + #if STBTT_RASTERIZER_VERSION==1 + int x,dx; + float ey; + int direction; + #elif STBTT_RASTERIZER_VERSION==2 + float fx,fdx,fdy; + float direction; + float sy; + float ey; + #else + #error "Unrecognized value of STBTT_RASTERIZER_VERSION" + #endif +} stbtt__active_edge; + +#if STBTT_RASTERIZER_VERSION == 1 +#define STBTT_FIXSHIFT 10 +#define STBTT_FIX (1 << STBTT_FIXSHIFT) +#define STBTT_FIXMASK (STBTT_FIX-1) + +static stbtt__active_edge *stbtt__new_active(stbtt__hheap *hh, stbtt__edge *e, int off_x, float start_point, void *userdata) +{ + stbtt__active_edge *z = (stbtt__active_edge *) stbtt__hheap_alloc(hh, sizeof(*z), userdata); + float dxdy = (e->x1 - e->x0) / (e->y1 - e->y0); + STBTT_assert(z != NULL); + if (!z) return z; + + // round dx down to avoid overshooting + if (dxdy < 0) + z->dx = -STBTT_ifloor(STBTT_FIX * -dxdy); + else + z->dx = STBTT_ifloor(STBTT_FIX * dxdy); + + z->x = STBTT_ifloor(STBTT_FIX * e->x0 + z->dx * (start_point - e->y0)); // 
use z->dx so when we offset later it's by the same amount + z->x -= off_x * STBTT_FIX; + + z->ey = e->y1; + z->next = 0; + z->direction = e->invert ? 1 : -1; + return z; +} +#elif STBTT_RASTERIZER_VERSION == 2 +static stbtt__active_edge *stbtt__new_active(stbtt__hheap *hh, stbtt__edge *e, int off_x, float start_point, void *userdata) +{ + stbtt__active_edge *z = (stbtt__active_edge *) stbtt__hheap_alloc(hh, sizeof(*z), userdata); + float dxdy = (e->x1 - e->x0) / (e->y1 - e->y0); + STBTT_assert(z != NULL); + //STBTT_assert(e->y0 <= start_point); + if (!z) return z; + z->fdx = dxdy; + z->fdy = dxdy != 0.0f ? (1.0f/dxdy) : 0.0f; + z->fx = e->x0 + dxdy * (start_point - e->y0); + z->fx -= off_x; + z->direction = e->invert ? 1.0f : -1.0f; + z->sy = e->y0; + z->ey = e->y1; + z->next = 0; + return z; +} +#else +#error "Unrecognized value of STBTT_RASTERIZER_VERSION" +#endif + +#if STBTT_RASTERIZER_VERSION == 1 +// note: this routine clips fills that extend off the edges... ideally this +// wouldn't happen, but it could happen if the truetype glyph bounding boxes +// are wrong, or if the user supplies a too-small bitmap +static void stbtt__fill_active_edges(unsigned char *scanline, int len, stbtt__active_edge *e, int max_weight) +{ + // non-zero winding fill + int x0=0, w=0; + + while (e) { + if (w == 0) { + // if we're currently at zero, we need to record the edge start point + x0 = e->x; w += e->direction; + } else { + int x1 = e->x; w += e->direction; + // if we went to zero, we need to draw + if (w == 0) { + int i = x0 >> STBTT_FIXSHIFT; + int j = x1 >> STBTT_FIXSHIFT; + + if (i < len && j >= 0) { + if (i == j) { + // x0,x1 are the same pixel, so compute combined coverage + scanline[i] = scanline[i] + (stbtt_uint8) ((x1 - x0) * max_weight >> STBTT_FIXSHIFT); + } else { + if (i >= 0) // add antialiasing for x0 + scanline[i] = scanline[i] + (stbtt_uint8) (((STBTT_FIX - (x0 & STBTT_FIXMASK)) * max_weight) >> STBTT_FIXSHIFT); + else + i = -1; // clip + + if (j < len) // add antialiasing for x1 + scanline[j] = scanline[j] + (stbtt_uint8) (((x1 & STBTT_FIXMASK) * max_weight) >> STBTT_FIXSHIFT); + else + j = len; // clip + + for (++i; i < j; ++i) // fill pixels between x0 and x1 + scanline[i] = scanline[i] + (stbtt_uint8) max_weight; + } + } + } + } + + e = e->next; + } +} + +static void stbtt__rasterize_sorted_edges(stbtt__bitmap *result, stbtt__edge *e, int n, int vsubsample, int off_x, int off_y, void *userdata) +{ + stbtt__hheap hh = { 0, 0, 0 }; + stbtt__active_edge *active = NULL; + int y,j=0; + int max_weight = (255 / vsubsample); // weight per vertical scanline + int s; // vertical subsample index + unsigned char scanline_data[512], *scanline; + + if (result->w > 512) + scanline = (unsigned char *) STBTT_malloc(result->w, userdata); + else + scanline = scanline_data; + + y = off_y * vsubsample; + e[n].y0 = (off_y + result->h) * (float) vsubsample + 1; + + while (j < result->h) { + STBTT_memset(scanline, 0, result->w); + for (s=0; s < vsubsample; ++s) { + // find center of pixel for this scanline + float scan_y = y + 0.5f; + stbtt__active_edge **step = &active; + + // update all active edges; + // remove all active edges that terminate before the center of this scanline + while (*step) { + stbtt__active_edge * z = *step; + if (z->ey <= scan_y) { + *step = z->next; // delete from list + STBTT_assert(z->direction); + z->direction = 0; + stbtt__hheap_free(&hh, z); + } else { + z->x += z->dx; // advance to position for current scanline + step = &((*step)->next); // advance through list + } + } + + 
// resort the list if needed + for(;;) { + int changed=0; + step = &active; + while (*step && (*step)->next) { + if ((*step)->x > (*step)->next->x) { + stbtt__active_edge *t = *step; + stbtt__active_edge *q = t->next; + + t->next = q->next; + q->next = t; + *step = q; + changed = 1; + } + step = &(*step)->next; + } + if (!changed) break; + } + + // insert all edges that start before the center of this scanline -- omit ones that also end on this scanline + while (e->y0 <= scan_y) { + if (e->y1 > scan_y) { + stbtt__active_edge *z = stbtt__new_active(&hh, e, off_x, scan_y, userdata); + if (z != NULL) { + // find insertion point + if (active == NULL) + active = z; + else if (z->x < active->x) { + // insert at front + z->next = active; + active = z; + } else { + // find thing to insert AFTER + stbtt__active_edge *p = active; + while (p->next && p->next->x < z->x) + p = p->next; + // at this point, p->next->x is NOT < z->x + z->next = p->next; + p->next = z; + } + } + } + ++e; + } + + // now process all active edges in XOR fashion + if (active) + stbtt__fill_active_edges(scanline, result->w, active, max_weight); + + ++y; + } + STBTT_memcpy(result->pixels + j * result->stride, scanline, result->w); + ++j; + } + + stbtt__hheap_cleanup(&hh, userdata); + + if (scanline != scanline_data) + STBTT_free(scanline, userdata); +} + +#elif STBTT_RASTERIZER_VERSION == 2 + +// the edge passed in here does not cross the vertical line at x or the vertical line at x+1 +// (i.e. it has already been clipped to those) +static void stbtt__handle_clipped_edge(float *scanline, int x, stbtt__active_edge *e, float x0, float y0, float x1, float y1) +{ + if (y0 == y1) return; + STBTT_assert(y0 < y1); + STBTT_assert(e->sy <= e->ey); + if (y0 > e->ey) return; + if (y1 < e->sy) return; + if (y0 < e->sy) { + x0 += (x1-x0) * (e->sy - y0) / (y1-y0); + y0 = e->sy; + } + if (y1 > e->ey) { + x1 += (x1-x0) * (e->ey - y1) / (y1-y0); + y1 = e->ey; + } + + if (x0 == x) + STBTT_assert(x1 <= x+1); + else if (x0 == x+1) + STBTT_assert(x1 >= x); + else if (x0 <= x) + STBTT_assert(x1 <= x); + else if (x0 >= x+1) + STBTT_assert(x1 >= x+1); + else + STBTT_assert(x1 >= x && x1 <= x+1); + + if (x0 <= x && x1 <= x) + scanline[x] += e->direction * (y1-y0); + else if (x0 >= x+1 && x1 >= x+1) + ; + else { + STBTT_assert(x0 >= x && x0 <= x+1 && x1 >= x && x1 <= x+1); + scanline[x] += e->direction * (y1-y0) * (1-((x0-x)+(x1-x))/2); // coverage = 1 - average x position + } +} + +static float stbtt__sized_trapezoid_area(float height, float top_width, float bottom_width) +{ + STBTT_assert(top_width >= 0); + STBTT_assert(bottom_width >= 0); + return (top_width + bottom_width) / 2.0f * height; +} + +static float stbtt__position_trapezoid_area(float height, float tx0, float tx1, float bx0, float bx1) +{ + return stbtt__sized_trapezoid_area(height, tx1 - tx0, bx1 - bx0); +} + +static float stbtt__sized_triangle_area(float height, float width) +{ + return height * width / 2; +} + +static void stbtt__fill_active_edges_new(float *scanline, float *scanline_fill, int len, stbtt__active_edge *e, float y_top) +{ + float y_bottom = y_top+1; + + while (e) { + // brute force every pixel + + // compute intersection points with top & bottom + STBTT_assert(e->ey >= y_top); + + if (e->fdx == 0) { + float x0 = e->fx; + if (x0 < len) { + if (x0 >= 0) { + stbtt__handle_clipped_edge(scanline,(int) x0,e, x0,y_top, x0,y_bottom); + stbtt__handle_clipped_edge(scanline_fill-1,(int) x0+1,e, x0,y_top, x0,y_bottom); + } else { + stbtt__handle_clipped_edge(scanline_fill-1,0,e, 
x0,y_top, x0,y_bottom); + } + } + } else { + float x0 = e->fx; + float dx = e->fdx; + float xb = x0 + dx; + float x_top, x_bottom; + float sy0,sy1; + float dy = e->fdy; + STBTT_assert(e->sy <= y_bottom && e->ey >= y_top); + + // compute endpoints of line segment clipped to this scanline (if the + // line segment starts on this scanline. x0 is the intersection of the + // line with y_top, but that may be off the line segment. + if (e->sy > y_top) { + x_top = x0 + dx * (e->sy - y_top); + sy0 = e->sy; + } else { + x_top = x0; + sy0 = y_top; + } + if (e->ey < y_bottom) { + x_bottom = x0 + dx * (e->ey - y_top); + sy1 = e->ey; + } else { + x_bottom = xb; + sy1 = y_bottom; + } + + if (x_top >= 0 && x_bottom >= 0 && x_top < len && x_bottom < len) { + // from here on, we don't have to range check x values + + if ((int) x_top == (int) x_bottom) { + float height; + // simple case, only spans one pixel + int x = (int) x_top; + height = (sy1 - sy0) * e->direction; + STBTT_assert(x >= 0 && x < len); + scanline[x] += stbtt__position_trapezoid_area(height, x_top, x+1.0f, x_bottom, x+1.0f); + scanline_fill[x] += height; // everything right of this pixel is filled + } else { + int x,x1,x2; + float y_crossing, y_final, step, sign, area; + // covers 2+ pixels + if (x_top > x_bottom) { + // flip scanline vertically; signed area is the same + float t; + sy0 = y_bottom - (sy0 - y_top); + sy1 = y_bottom - (sy1 - y_top); + t = sy0, sy0 = sy1, sy1 = t; + t = x_bottom, x_bottom = x_top, x_top = t; + dx = -dx; + dy = -dy; + t = x0, x0 = xb, xb = t; + } + STBTT_assert(dy >= 0); + STBTT_assert(dx >= 0); + + x1 = (int) x_top; + x2 = (int) x_bottom; + // compute intersection with y axis at x1+1 + y_crossing = y_top + dy * (x1+1 - x0); + + // compute intersection with y axis at x2 + y_final = y_top + dy * (x2 - x0); + + // x1 x_top x2 x_bottom + // y_top +------|-----+------------+------------+--------|---+------------+ + // | | | | | | + // | | | | | | + // sy0 | Txxxxx|............|............|............|............| + // y_crossing | *xxxxx.......|............|............|............| + // | | xxxxx..|............|............|............| + // | | /- xx*xxxx........|............|............| + // | | dy < | xxxxxx..|............|............| + // y_final | | \- | xx*xxx.........|............| + // sy1 | | | | xxxxxB...|............| + // | | | | | | + // | | | | | | + // y_bottom +------------+------------+------------+------------+------------+ + // + // goal is to measure the area covered by '.' in each pixel + + // if x2 is right at the right edge of x1, y_crossing can blow up, github #1057 + // @TODO: maybe test against sy1 rather than y_bottom? + if (y_crossing > y_bottom) + y_crossing = y_bottom; + + sign = e->direction; + + // area of the rectangle covered from sy0..y_crossing + area = sign * (y_crossing-sy0); + + // area of the triangle (x_top,sy0), (x1+1,sy0), (x1+1,y_crossing) + scanline[x1] += stbtt__sized_triangle_area(area, x1+1 - x_top); + + // check if final y_crossing is blown up; no test case for this + if (y_final > y_bottom) { + y_final = y_bottom; + dy = (y_final - y_crossing ) / (x2 - (x1+1)); // if denom=0, y_final = y_crossing, so y_final <= y_bottom + } + + // in second pixel, area covered by line segment found in first pixel + // is always a rectangle 1 wide * the height of that line segment; this + // is exactly what the variable 'area' stores. it also gets a contribution + // from the line segment within it. 
the THIRD pixel will get the first + // pixel's rectangle contribution, the second pixel's rectangle contribution, + // and its own contribution. the 'own contribution' is the same in every pixel except + // the leftmost and rightmost, a trapezoid that slides down in each pixel. + // the second pixel's contribution to the third pixel will be the + // rectangle 1 wide times the height change in the second pixel, which is dy. + + step = sign * dy * 1; // dy is dy/dx, change in y for every 1 change in x, + // which multiplied by 1-pixel-width is how much pixel area changes for each step in x + // so the area advances by 'step' every time + + for (x = x1+1; x < x2; ++x) { + scanline[x] += area + step/2; // area of trapezoid is 1*step/2 + area += step; + } + STBTT_assert(STBTT_fabs(area) <= 1.01f); // accumulated error from area += step unless we round step down + STBTT_assert(sy1 > y_final-0.01f); + + // area covered in the last pixel is the rectangle from all the pixels to the left, + // plus the trapezoid filled by the line segment in this pixel all the way to the right edge + scanline[x2] += area + sign * stbtt__position_trapezoid_area(sy1-y_final, (float) x2, x2+1.0f, x_bottom, x2+1.0f); + + // the rest of the line is filled based on the total height of the line segment in this pixel + scanline_fill[x2] += sign * (sy1-sy0); + } + } else { + // if edge goes outside of box we're drawing, we require + // clipping logic. since this does not match the intended use + // of this library, we use a different, very slow brute + // force implementation + // note though that this does happen some of the time because + // x_top and x_bottom can be extrapolated at the top & bottom of + // the shape and actually lie outside the bounding box + int x; + for (x=0; x < len; ++x) { + // cases: + // + // there can be up to two intersections with the pixel. any intersection + // with left or right edges can be handled by splitting into two (or three) + // regions. intersections with top & bottom do not necessitate case-wise logic. + // + // the old way of doing this found the intersections with the left & right edges, + // then used some simple logic to produce up to three segments in sorted order + // from top-to-bottom. however, this had a problem: if an x edge was epsilon + // across the x border, then the corresponding y position might not be distinct + // from the other y segment, and it might ignored as an empty segment. to avoid + // that, we need to explicitly produce segments based on x positions. 
+ + // rename variables to clearly-defined pairs + float y0 = y_top; + float x1 = (float) (x); + float x2 = (float) (x+1); + float x3 = xb; + float y3 = y_bottom; + + // x = e->x + e->dx * (y-y_top) + // (y-y_top) = (x - e->x) / e->dx + // y = (x - e->x) / e->dx + y_top + float y1 = (x - x0) / dx + y_top; + float y2 = (x+1 - x0) / dx + y_top; + + if (x0 < x1 && x3 > x2) { // three segments descending down-right + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x1,y1); + stbtt__handle_clipped_edge(scanline,x,e, x1,y1, x2,y2); + stbtt__handle_clipped_edge(scanline,x,e, x2,y2, x3,y3); + } else if (x3 < x1 && x0 > x2) { // three segments descending down-left + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x2,y2); + stbtt__handle_clipped_edge(scanline,x,e, x2,y2, x1,y1); + stbtt__handle_clipped_edge(scanline,x,e, x1,y1, x3,y3); + } else if (x0 < x1 && x3 > x1) { // two segments across x, down-right + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x1,y1); + stbtt__handle_clipped_edge(scanline,x,e, x1,y1, x3,y3); + } else if (x3 < x1 && x0 > x1) { // two segments across x, down-left + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x1,y1); + stbtt__handle_clipped_edge(scanline,x,e, x1,y1, x3,y3); + } else if (x0 < x2 && x3 > x2) { // two segments across x+1, down-right + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x2,y2); + stbtt__handle_clipped_edge(scanline,x,e, x2,y2, x3,y3); + } else if (x3 < x2 && x0 > x2) { // two segments across x+1, down-left + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x2,y2); + stbtt__handle_clipped_edge(scanline,x,e, x2,y2, x3,y3); + } else { // one segment + stbtt__handle_clipped_edge(scanline,x,e, x0,y0, x3,y3); + } + } + } + } + e = e->next; + } +} + +// directly AA rasterize edges w/o supersampling +static void stbtt__rasterize_sorted_edges(stbtt__bitmap *result, stbtt__edge *e, int n, int vsubsample, int off_x, int off_y, void *userdata) +{ + stbtt__hheap hh = { 0, 0, 0 }; + stbtt__active_edge *active = NULL; + int y,j=0, i; + float scanline_data[129], *scanline, *scanline2; + + STBTT__NOTUSED(vsubsample); + + if (result->w > 64) + scanline = (float *) STBTT_malloc((result->w*2+1) * sizeof(float), userdata); + else + scanline = scanline_data; + + scanline2 = scanline + result->w; + + y = off_y; + e[n].y0 = (float) (off_y + result->h) + 1; + + while (j < result->h) { + // find center of pixel for this scanline + float scan_y_top = y + 0.0f; + float scan_y_bottom = y + 1.0f; + stbtt__active_edge **step = &active; + + STBTT_memset(scanline , 0, result->w*sizeof(scanline[0])); + STBTT_memset(scanline2, 0, (result->w+1)*sizeof(scanline[0])); + + // update all active edges; + // remove all active edges that terminate before the top of this scanline + while (*step) { + stbtt__active_edge * z = *step; + if (z->ey <= scan_y_top) { + *step = z->next; // delete from list + STBTT_assert(z->direction); + z->direction = 0; + stbtt__hheap_free(&hh, z); + } else { + step = &((*step)->next); // advance through list + } + } + + // insert all edges that start before the bottom of this scanline + while (e->y0 <= scan_y_bottom) { + if (e->y0 != e->y1) { + stbtt__active_edge *z = stbtt__new_active(&hh, e, off_x, scan_y_top, userdata); + if (z != NULL) { + if (j == 0 && off_y != 0) { + if (z->ey < scan_y_top) { + // this can happen due to subpixel positioning and some kind of fp rounding error i think + z->ey = scan_y_top; + } + } + STBTT_assert(z->ey >= scan_y_top); // if we get really unlucky a tiny bit of an edge can be out of bounds + // insert at front + z->next = 
active;
+ active = z;
+ }
+ }
+ ++e;
+ }
+
+ // now process all active edges
+ if (active)
+ stbtt__fill_active_edges_new(scanline, scanline2+1, result->w, active, scan_y_top);
+
+ {
+ float sum = 0;
+ for (i=0; i < result->w; ++i) {
+ float k;
+ int m;
+ sum += scanline2[i];
+ k = scanline[i] + sum;
+ k = (float) STBTT_fabs(k)*255 + 0.5f;
+ m = (int) k;
+ if (m > 255) m = 255;
+ result->pixels[j*result->stride + i] = (unsigned char) m;
+ }
+ }
+ // advance all the edges
+ step = &active;
+ while (*step) {
+ stbtt__active_edge *z = *step;
+ z->fx += z->fdx; // advance to position for current scanline
+ step = &((*step)->next); // advance through list
+ }
+
+ ++y;
+ ++j;
+ }
+
+ stbtt__hheap_cleanup(&hh, userdata);
+
+ if (scanline != scanline_data)
+ STBTT_free(scanline, userdata);
+}
+#else
+#error "Unrecognized value of STBTT_RASTERIZER_VERSION"
+#endif
+
+#define STBTT__COMPARE(a,b) ((a)->y0 < (b)->y0)
+
+static void stbtt__sort_edges_ins_sort(stbtt__edge *p, int n)
+{
+ int i,j;
+ for (i=1; i < n; ++i) {
+ stbtt__edge t = p[i], *a = &t;
+ j = i;
+ while (j > 0) {
+ stbtt__edge *b = &p[j-1];
+ int c = STBTT__COMPARE(a,b);
+ if (!c) break;
+ p[j] = p[j-1];
+ --j;
+ }
+ if (i != j)
+ p[j] = t;
+ }
+}
+
+static void stbtt__sort_edges_quicksort(stbtt__edge *p, int n)
+{
+ /* threshold for transitioning to insertion sort */
+ while (n > 12) {
+ stbtt__edge t;
+ int c01,c12,c,m,i,j;
+
+ /* compute median of three */
+ m = n >> 1;
+ c01 = STBTT__COMPARE(&p[0],&p[m]);
+ c12 = STBTT__COMPARE(&p[m],&p[n-1]);
+ /* if 0 >= mid >= end, or 0 < mid < end, then use mid */
+ if (c01 != c12) {
+ /* otherwise, we'll need to swap something else to middle */
+ int z;
+ c = STBTT__COMPARE(&p[0],&p[n-1]);
+ /* 0>mid && mid<n: 0>n => n; 0<n => 0 */
+ /* 0<mid && mid>n: 0>n => 0; 0<n => n */
+ z = (c == c12) ? 0 : n-1;
+ t = p[z];
+ p[z] = p[m];
+ p[m] = t;
+ }
+ /* now p[m] is the median-of-three */
+ /* swap it to the beginning so it won't move around */
+ t = p[0];
+ p[0] = p[m];
+ p[m] = t;
+
+ /* partition loop */
+ i=1;
+ j=n-1;
+ for(;;) {
+ /* handling of equality is crucial here */
+ /* for sentinels & efficiency with duplicates */
+ for (;;++i) {
+ if (!STBTT__COMPARE(&p[i], &p[0])) break;
+ }
+ for (;;--j) {
+ if (!STBTT__COMPARE(&p[0], &p[j])) break;
+ }
+ /* make sure we haven't crossed */
+ if (i >= j) break;
+ t = p[i];
+ p[i] = p[j];
+ p[j] = t;
+
+ ++i;
+ --j;
+ }
+ /* recurse on smaller side, iterate on larger */
+ if (j < (n-i)) {
+ stbtt__sort_edges_quicksort(p,j);
+ p = p+i;
+ n = n-i;
+ } else {
+ stbtt__sort_edges_quicksort(p+i, n-i);
+ n = j;
+ }
+ }
+}
+
+static void stbtt__sort_edges(stbtt__edge *p, int n)
+{
+ stbtt__sort_edges_quicksort(p, n);
+ stbtt__sort_edges_ins_sort(p, n);
+}
+
+typedef struct
+{
+ float x,y;
+} stbtt__point;
+
+static void stbtt__rasterize(stbtt__bitmap *result, stbtt__point *pts, int *wcount, int windings, float scale_x, float scale_y, float shift_x, float shift_y, int off_x, int off_y, int invert, void *userdata)
+{
+ float y_scale_inv = invert ? -scale_y : scale_y;
+ stbtt__edge *e;
+ int n,i,j,k,m;
+#if STBTT_RASTERIZER_VERSION == 1
+ int vsubsample = result->h < 8 ?
15 : 5; +#elif STBTT_RASTERIZER_VERSION == 2 + int vsubsample = 1; +#else + #error "Unrecognized value of STBTT_RASTERIZER_VERSION" +#endif + // vsubsample should divide 255 evenly; otherwise we won't reach full opacity + + // now we have to blow out the windings into explicit edge lists + n = 0; + for (i=0; i < windings; ++i) + n += wcount[i]; + + e = (stbtt__edge *) STBTT_malloc(sizeof(*e) * (n+1), userdata); // add an extra one as a sentinel + if (e == 0) return; + n = 0; + + m=0; + for (i=0; i < windings; ++i) { + stbtt__point *p = pts + m; + m += wcount[i]; + j = wcount[i]-1; + for (k=0; k < wcount[i]; j=k++) { + int a=k,b=j; + // skip the edge if horizontal + if (p[j].y == p[k].y) + continue; + // add edge from j to k to the list + e[n].invert = 0; + if (invert ? p[j].y > p[k].y : p[j].y < p[k].y) { + e[n].invert = 1; + a=j,b=k; + } + e[n].x0 = p[a].x * scale_x + shift_x; + e[n].y0 = (p[a].y * y_scale_inv + shift_y) * vsubsample; + e[n].x1 = p[b].x * scale_x + shift_x; + e[n].y1 = (p[b].y * y_scale_inv + shift_y) * vsubsample; + ++n; + } + } + + // now sort the edges by their highest point (should snap to integer, and then by x) + //STBTT_sort(e, n, sizeof(e[0]), stbtt__edge_compare); + stbtt__sort_edges(e, n); + + // now, traverse the scanlines and find the intersections on each scanline, use xor winding rule + stbtt__rasterize_sorted_edges(result, e, n, vsubsample, off_x, off_y, userdata); + + STBTT_free(e, userdata); +} + +static void stbtt__add_point(stbtt__point *points, int n, float x, float y) +{ + if (!points) return; // during first pass, it's unallocated + points[n].x = x; + points[n].y = y; +} + +// tessellate until threshold p is happy... @TODO warped to compensate for non-linear stretching +static int stbtt__tesselate_curve(stbtt__point *points, int *num_points, float x0, float y0, float x1, float y1, float x2, float y2, float objspace_flatness_squared, int n) +{ + // midpoint + float mx = (x0 + 2*x1 + x2)/4; + float my = (y0 + 2*y1 + y2)/4; + // versus directly drawn line + float dx = (x0+x2)/2 - mx; + float dy = (y0+y2)/2 - my; + if (n > 16) // 65536 segments on one curve better be enough! + return 1; + if (dx*dx+dy*dy > objspace_flatness_squared) { // half-pixel error allowed... need to be smaller if AA + stbtt__tesselate_curve(points, num_points, x0,y0, (x0+x1)/2.0f,(y0+y1)/2.0f, mx,my, objspace_flatness_squared,n+1); + stbtt__tesselate_curve(points, num_points, mx,my, (x1+x2)/2.0f,(y1+y2)/2.0f, x2,y2, objspace_flatness_squared,n+1); + } else { + stbtt__add_point(points, *num_points,x2,y2); + *num_points = *num_points+1; + } + return 1; +} + +static void stbtt__tesselate_cubic(stbtt__point *points, int *num_points, float x0, float y0, float x1, float y1, float x2, float y2, float x3, float y3, float objspace_flatness_squared, int n) +{ + // @TODO this "flatness" calculation is just made-up nonsense that seems to work well enough + float dx0 = x1-x0; + float dy0 = y1-y0; + float dx1 = x2-x1; + float dy1 = y2-y1; + float dx2 = x3-x2; + float dy2 = y3-y2; + float dx = x3-x0; + float dy = y3-y0; + float longlen = (float) (STBTT_sqrt(dx0*dx0+dy0*dy0)+STBTT_sqrt(dx1*dx1+dy1*dy1)+STBTT_sqrt(dx2*dx2+dy2*dy2)); + float shortlen = (float) STBTT_sqrt(dx*dx+dy*dy); + float flatness_squared = longlen*longlen-shortlen*shortlen; + + if (n > 16) // 65536 segments on one curve better be enough! 
+ return; + + if (flatness_squared > objspace_flatness_squared) { + float x01 = (x0+x1)/2; + float y01 = (y0+y1)/2; + float x12 = (x1+x2)/2; + float y12 = (y1+y2)/2; + float x23 = (x2+x3)/2; + float y23 = (y2+y3)/2; + + float xa = (x01+x12)/2; + float ya = (y01+y12)/2; + float xb = (x12+x23)/2; + float yb = (y12+y23)/2; + + float mx = (xa+xb)/2; + float my = (ya+yb)/2; + + stbtt__tesselate_cubic(points, num_points, x0,y0, x01,y01, xa,ya, mx,my, objspace_flatness_squared,n+1); + stbtt__tesselate_cubic(points, num_points, mx,my, xb,yb, x23,y23, x3,y3, objspace_flatness_squared,n+1); + } else { + stbtt__add_point(points, *num_points,x3,y3); + *num_points = *num_points+1; + } +} + +// returns number of contours +static stbtt__point *stbtt_FlattenCurves(stbtt_vertex *vertices, int num_verts, float objspace_flatness, int **contour_lengths, int *num_contours, void *userdata) +{ + stbtt__point *points=0; + int num_points=0; + + float objspace_flatness_squared = objspace_flatness * objspace_flatness; + int i,n=0,start=0, pass; + + // count how many "moves" there are to get the contour count + for (i=0; i < num_verts; ++i) + if (vertices[i].type == STBTT_vmove) + ++n; + + *num_contours = n; + if (n == 0) return 0; + + *contour_lengths = (int *) STBTT_malloc(sizeof(**contour_lengths) * n, userdata); + + if (*contour_lengths == 0) { + *num_contours = 0; + return 0; + } + + // make two passes through the points so we don't need to realloc + for (pass=0; pass < 2; ++pass) { + float x=0,y=0; + if (pass == 1) { + points = (stbtt__point *) STBTT_malloc(num_points * sizeof(points[0]), userdata); + if (points == NULL) goto error; + } + num_points = 0; + n= -1; + for (i=0; i < num_verts; ++i) { + switch (vertices[i].type) { + case STBTT_vmove: + // start the next contour + if (n >= 0) + (*contour_lengths)[n] = num_points - start; + ++n; + start = num_points; + + x = vertices[i].x, y = vertices[i].y; + stbtt__add_point(points, num_points++, x,y); + break; + case STBTT_vline: + x = vertices[i].x, y = vertices[i].y; + stbtt__add_point(points, num_points++, x, y); + break; + case STBTT_vcurve: + stbtt__tesselate_curve(points, &num_points, x,y, + vertices[i].cx, vertices[i].cy, + vertices[i].x, vertices[i].y, + objspace_flatness_squared, 0); + x = vertices[i].x, y = vertices[i].y; + break; + case STBTT_vcubic: + stbtt__tesselate_cubic(points, &num_points, x,y, + vertices[i].cx, vertices[i].cy, + vertices[i].cx1, vertices[i].cy1, + vertices[i].x, vertices[i].y, + objspace_flatness_squared, 0); + x = vertices[i].x, y = vertices[i].y; + break; + } + } + (*contour_lengths)[n] = num_points - start; + } + + return points; +error: + STBTT_free(points, userdata); + STBTT_free(*contour_lengths, userdata); + *contour_lengths = 0; + *num_contours = 0; + return NULL; +} + +STBTT_DEF void stbtt_Rasterize(stbtt__bitmap *result, float flatness_in_pixels, stbtt_vertex *vertices, int num_verts, float scale_x, float scale_y, float shift_x, float shift_y, int x_off, int y_off, int invert, void *userdata) +{ + float scale = scale_x > scale_y ? 
scale_y : scale_x; + int winding_count = 0; + int *winding_lengths = NULL; + stbtt__point *windings = stbtt_FlattenCurves(vertices, num_verts, flatness_in_pixels / scale, &winding_lengths, &winding_count, userdata); + if (windings) { + stbtt__rasterize(result, windings, winding_lengths, winding_count, scale_x, scale_y, shift_x, shift_y, x_off, y_off, invert, userdata); + STBTT_free(winding_lengths, userdata); + STBTT_free(windings, userdata); + } +} + +STBTT_DEF void stbtt_FreeBitmap(unsigned char *bitmap, void *userdata) +{ + STBTT_free(bitmap, userdata); +} + +STBTT_DEF unsigned char *stbtt_GetGlyphBitmapSubpixel(const stbtt_fontinfo *info, float scale_x, float scale_y, float shift_x, float shift_y, int glyph, int *width, int *height, int *xoff, int *yoff) +{ + int ix0,iy0,ix1,iy1; + stbtt__bitmap gbm; + stbtt_vertex *vertices; + int num_verts = stbtt_GetGlyphShape(info, glyph, &vertices); + + if (scale_x == 0) scale_x = scale_y; + if (scale_y == 0) { + if (scale_x == 0) { + STBTT_free(vertices, info->userdata); + return NULL; + } + scale_y = scale_x; + } + + stbtt_GetGlyphBitmapBoxSubpixel(info, glyph, scale_x, scale_y, shift_x, shift_y, &ix0,&iy0,&ix1,&iy1); + + // now we get the size + gbm.w = (ix1 - ix0); + gbm.h = (iy1 - iy0); + gbm.pixels = NULL; // in case we error + + if (width ) *width = gbm.w; + if (height) *height = gbm.h; + if (xoff ) *xoff = ix0; + if (yoff ) *yoff = iy0; + + if (gbm.w && gbm.h) { + gbm.pixels = (unsigned char *) STBTT_malloc(gbm.w * gbm.h, info->userdata); + if (gbm.pixels) { + gbm.stride = gbm.w; + + stbtt_Rasterize(&gbm, 0.35f, vertices, num_verts, scale_x, scale_y, shift_x, shift_y, ix0, iy0, 1, info->userdata); + } + } + STBTT_free(vertices, info->userdata); + return gbm.pixels; +} + +STBTT_DEF unsigned char *stbtt_GetGlyphBitmap(const stbtt_fontinfo *info, float scale_x, float scale_y, int glyph, int *width, int *height, int *xoff, int *yoff) +{ + return stbtt_GetGlyphBitmapSubpixel(info, scale_x, scale_y, 0.0f, 0.0f, glyph, width, height, xoff, yoff); +} + +STBTT_DEF void stbtt_MakeGlyphBitmapSubpixel(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int glyph) +{ + int ix0,iy0; + stbtt_vertex *vertices; + int num_verts = stbtt_GetGlyphShape(info, glyph, &vertices); + stbtt__bitmap gbm; + + stbtt_GetGlyphBitmapBoxSubpixel(info, glyph, scale_x, scale_y, shift_x, shift_y, &ix0,&iy0,0,0); + gbm.pixels = output; + gbm.w = out_w; + gbm.h = out_h; + gbm.stride = out_stride; + + if (gbm.w && gbm.h) + stbtt_Rasterize(&gbm, 0.35f, vertices, num_verts, scale_x, scale_y, shift_x, shift_y, ix0,iy0, 1, info->userdata); + + STBTT_free(vertices, info->userdata); +} + +STBTT_DEF void stbtt_MakeGlyphBitmap(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, int glyph) +{ + stbtt_MakeGlyphBitmapSubpixel(info, output, out_w, out_h, out_stride, scale_x, scale_y, 0.0f,0.0f, glyph); +} + +STBTT_DEF unsigned char *stbtt_GetCodepointBitmapSubpixel(const stbtt_fontinfo *info, float scale_x, float scale_y, float shift_x, float shift_y, int codepoint, int *width, int *height, int *xoff, int *yoff) +{ + return stbtt_GetGlyphBitmapSubpixel(info, scale_x, scale_y,shift_x,shift_y, stbtt_FindGlyphIndex(info,codepoint), width,height,xoff,yoff); +} + +STBTT_DEF void stbtt_MakeCodepointBitmapSubpixelPrefilter(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float 
scale_y, float shift_x, float shift_y, int oversample_x, int oversample_y, float *sub_x, float *sub_y, int codepoint) +{ + stbtt_MakeGlyphBitmapSubpixelPrefilter(info, output, out_w, out_h, out_stride, scale_x, scale_y, shift_x, shift_y, oversample_x, oversample_y, sub_x, sub_y, stbtt_FindGlyphIndex(info,codepoint)); +} + +STBTT_DEF void stbtt_MakeCodepointBitmapSubpixel(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int codepoint) +{ + stbtt_MakeGlyphBitmapSubpixel(info, output, out_w, out_h, out_stride, scale_x, scale_y, shift_x, shift_y, stbtt_FindGlyphIndex(info,codepoint)); +} + +STBTT_DEF unsigned char *stbtt_GetCodepointBitmap(const stbtt_fontinfo *info, float scale_x, float scale_y, int codepoint, int *width, int *height, int *xoff, int *yoff) +{ + return stbtt_GetCodepointBitmapSubpixel(info, scale_x, scale_y, 0.0f,0.0f, codepoint, width,height,xoff,yoff); +} + +STBTT_DEF void stbtt_MakeCodepointBitmap(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, int codepoint) +{ + stbtt_MakeCodepointBitmapSubpixel(info, output, out_w, out_h, out_stride, scale_x, scale_y, 0.0f,0.0f, codepoint); +} + +////////////////////////////////////////////////////////////////////////////// +// +// bitmap baking +// +// This is SUPER-CRAPPY packing to keep source code small + +static int stbtt_BakeFontBitmap_internal(unsigned char *data, int offset, // font location (use offset=0 for plain .ttf) + float pixel_height, // height of font in pixels + unsigned char *pixels, int pw, int ph, // bitmap to be filled in + int first_char, int num_chars, // characters to bake + stbtt_bakedchar *chardata) +{ + float scale; + int x,y,bottom_y, i; + stbtt_fontinfo f; + f.userdata = NULL; + if (!stbtt_InitFont(&f, data, offset)) + return -1; + STBTT_memset(pixels, 0, pw*ph); // background of 0 around pixels + x=y=1; + bottom_y = 1; + + scale = stbtt_ScaleForPixelHeight(&f, pixel_height); + + for (i=0; i < num_chars; ++i) { + int advance, lsb, x0,y0,x1,y1,gw,gh; + int g = stbtt_FindGlyphIndex(&f, first_char + i); + stbtt_GetGlyphHMetrics(&f, g, &advance, &lsb); + stbtt_GetGlyphBitmapBox(&f, g, scale,scale, &x0,&y0,&x1,&y1); + gw = x1-x0; + gh = y1-y0; + if (x + gw + 1 >= pw) + y = bottom_y, x = 1; // advance to next row + if (y + gh + 1 >= ph) // check if it fits vertically AFTER potentially moving to next row + return -i; + STBTT_assert(x+gw < pw); + STBTT_assert(y+gh < ph); + stbtt_MakeGlyphBitmap(&f, pixels+x+y*pw, gw,gh,pw, scale,scale, g); + chardata[i].x0 = (stbtt_int16) x; + chardata[i].y0 = (stbtt_int16) y; + chardata[i].x1 = (stbtt_int16) (x + gw); + chardata[i].y1 = (stbtt_int16) (y + gh); + chardata[i].xadvance = scale * advance; + chardata[i].xoff = (float) x0; + chardata[i].yoff = (float) y0; + x = x + gw + 1; + if (y+gh+1 > bottom_y) + bottom_y = y+gh+1; + } + return bottom_y; +} + +STBTT_DEF void stbtt_GetBakedQuad(const stbtt_bakedchar *chardata, int pw, int ph, int char_index, float *xpos, float *ypos, stbtt_aligned_quad *q, int opengl_fillrule) +{ + float d3d_bias = opengl_fillrule ? 
0 : -0.5f; + float ipw = 1.0f / pw, iph = 1.0f / ph; + const stbtt_bakedchar *b = chardata + char_index; + int round_x = STBTT_ifloor((*xpos + b->xoff) + 0.5f); + int round_y = STBTT_ifloor((*ypos + b->yoff) + 0.5f); + + q->x0 = round_x + d3d_bias; + q->y0 = round_y + d3d_bias; + q->x1 = round_x + b->x1 - b->x0 + d3d_bias; + q->y1 = round_y + b->y1 - b->y0 + d3d_bias; + + q->s0 = b->x0 * ipw; + q->t0 = b->y0 * iph; + q->s1 = b->x1 * ipw; + q->t1 = b->y1 * iph; + + *xpos += b->xadvance; +} + +////////////////////////////////////////////////////////////////////////////// +// +// rectangle packing replacement routines if you don't have stb_rect_pack.h +// + +#ifndef STB_RECT_PACK_VERSION + +typedef int stbrp_coord; + +//////////////////////////////////////////////////////////////////////////////////// +// // +// // +// COMPILER WARNING ?!?!? // +// // +// // +// if you get a compile warning due to these symbols being defined more than // +// once, move #include "stb_rect_pack.h" before #include "stb_truetype.h" // +// // +//////////////////////////////////////////////////////////////////////////////////// + +typedef struct +{ + int width,height; + int x,y,bottom_y; +} stbrp_context; + +typedef struct +{ + unsigned char x; +} stbrp_node; + +struct stbrp_rect +{ + stbrp_coord x,y; + int id,w,h,was_packed; +}; + +static void stbrp_init_target(stbrp_context *con, int pw, int ph, stbrp_node *nodes, int num_nodes) +{ + con->width = pw; + con->height = ph; + con->x = 0; + con->y = 0; + con->bottom_y = 0; + STBTT__NOTUSED(nodes); + STBTT__NOTUSED(num_nodes); +} + +static void stbrp_pack_rects(stbrp_context *con, stbrp_rect *rects, int num_rects) +{ + int i; + for (i=0; i < num_rects; ++i) { + if (con->x + rects[i].w > con->width) { + con->x = 0; + con->y = con->bottom_y; + } + if (con->y + rects[i].h > con->height) + break; + rects[i].x = con->x; + rects[i].y = con->y; + rects[i].was_packed = 1; + con->x += rects[i].w; + if (con->y + rects[i].h > con->bottom_y) + con->bottom_y = con->y + rects[i].h; + } + for ( ; i < num_rects; ++i) + rects[i].was_packed = 0; +} +#endif + +////////////////////////////////////////////////////////////////////////////// +// +// bitmap baking +// +// This is SUPER-AWESOME (tm Ryan Gordon) packing using stb_rect_pack.h. If +// stb_rect_pack.h isn't available, it uses the BakeFontBitmap strategy. + +STBTT_DEF int stbtt_PackBegin(stbtt_pack_context *spc, unsigned char *pixels, int pw, int ph, int stride_in_bytes, int padding, void *alloc_context) +{ + stbrp_context *context = (stbrp_context *) STBTT_malloc(sizeof(*context) ,alloc_context); + int num_nodes = pw - padding; + stbrp_node *nodes = (stbrp_node *) STBTT_malloc(sizeof(*nodes ) * num_nodes,alloc_context); + + if (context == NULL || nodes == NULL) { + if (context != NULL) STBTT_free(context, alloc_context); + if (nodes != NULL) STBTT_free(nodes , alloc_context); + return 0; + } + + spc->user_allocator_context = alloc_context; + spc->width = pw; + spc->height = ph; + spc->pixels = pixels; + spc->pack_info = context; + spc->nodes = nodes; + spc->padding = padding; + spc->stride_in_bytes = stride_in_bytes != 0 ? 
stride_in_bytes : pw; + spc->h_oversample = 1; + spc->v_oversample = 1; + spc->skip_missing = 0; + + stbrp_init_target(context, pw-padding, ph-padding, nodes, num_nodes); + + if (pixels) + STBTT_memset(pixels, 0, pw*ph); // background of 0 around pixels + + return 1; +} + +STBTT_DEF void stbtt_PackEnd (stbtt_pack_context *spc) +{ + STBTT_free(spc->nodes , spc->user_allocator_context); + STBTT_free(spc->pack_info, spc->user_allocator_context); +} + +STBTT_DEF void stbtt_PackSetOversampling(stbtt_pack_context *spc, unsigned int h_oversample, unsigned int v_oversample) +{ + STBTT_assert(h_oversample <= STBTT_MAX_OVERSAMPLE); + STBTT_assert(v_oversample <= STBTT_MAX_OVERSAMPLE); + if (h_oversample <= STBTT_MAX_OVERSAMPLE) + spc->h_oversample = h_oversample; + if (v_oversample <= STBTT_MAX_OVERSAMPLE) + spc->v_oversample = v_oversample; +} + +STBTT_DEF void stbtt_PackSetSkipMissingCodepoints(stbtt_pack_context *spc, int skip) +{ + spc->skip_missing = skip; +} + +#define STBTT__OVER_MASK (STBTT_MAX_OVERSAMPLE-1) + +static void stbtt__h_prefilter(unsigned char *pixels, int w, int h, int stride_in_bytes, unsigned int kernel_width) +{ + unsigned char buffer[STBTT_MAX_OVERSAMPLE]; + int safe_w = w - kernel_width; + int j; + STBTT_memset(buffer, 0, STBTT_MAX_OVERSAMPLE); // suppress bogus warning from VS2013 -analyze + for (j=0; j < h; ++j) { + int i; + unsigned int total; + STBTT_memset(buffer, 0, kernel_width); + + total = 0; + + // make kernel_width a constant in common cases so compiler can optimize out the divide + switch (kernel_width) { + case 2: + for (i=0; i <= safe_w; ++i) { + total += pixels[i] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i]; + pixels[i] = (unsigned char) (total / 2); + } + break; + case 3: + for (i=0; i <= safe_w; ++i) { + total += pixels[i] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i]; + pixels[i] = (unsigned char) (total / 3); + } + break; + case 4: + for (i=0; i <= safe_w; ++i) { + total += pixels[i] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i]; + pixels[i] = (unsigned char) (total / 4); + } + break; + case 5: + for (i=0; i <= safe_w; ++i) { + total += pixels[i] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i]; + pixels[i] = (unsigned char) (total / 5); + } + break; + default: + for (i=0; i <= safe_w; ++i) { + total += pixels[i] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i]; + pixels[i] = (unsigned char) (total / kernel_width); + } + break; + } + + for (; i < w; ++i) { + STBTT_assert(pixels[i] == 0); + total -= buffer[i & STBTT__OVER_MASK]; + pixels[i] = (unsigned char) (total / kernel_width); + } + + pixels += stride_in_bytes; + } +} + +static void stbtt__v_prefilter(unsigned char *pixels, int w, int h, int stride_in_bytes, unsigned int kernel_width) +{ + unsigned char buffer[STBTT_MAX_OVERSAMPLE]; + int safe_h = h - kernel_width; + int j; + STBTT_memset(buffer, 0, STBTT_MAX_OVERSAMPLE); // suppress bogus warning from VS2013 -analyze + for (j=0; j < w; ++j) { + int i; + unsigned int total; + STBTT_memset(buffer, 0, kernel_width); + + total = 0; + + // make kernel_width a constant in common cases so compiler can optimize out the divide + switch (kernel_width) { + case 2: + for (i=0; i <= safe_h; ++i) { + total += pixels[i*stride_in_bytes] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i*stride_in_bytes]; + 
pixels[i*stride_in_bytes] = (unsigned char) (total / 2); + } + break; + case 3: + for (i=0; i <= safe_h; ++i) { + total += pixels[i*stride_in_bytes] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i*stride_in_bytes]; + pixels[i*stride_in_bytes] = (unsigned char) (total / 3); + } + break; + case 4: + for (i=0; i <= safe_h; ++i) { + total += pixels[i*stride_in_bytes] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i*stride_in_bytes]; + pixels[i*stride_in_bytes] = (unsigned char) (total / 4); + } + break; + case 5: + for (i=0; i <= safe_h; ++i) { + total += pixels[i*stride_in_bytes] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i*stride_in_bytes]; + pixels[i*stride_in_bytes] = (unsigned char) (total / 5); + } + break; + default: + for (i=0; i <= safe_h; ++i) { + total += pixels[i*stride_in_bytes] - buffer[i & STBTT__OVER_MASK]; + buffer[(i+kernel_width) & STBTT__OVER_MASK] = pixels[i*stride_in_bytes]; + pixels[i*stride_in_bytes] = (unsigned char) (total / kernel_width); + } + break; + } + + for (; i < h; ++i) { + STBTT_assert(pixels[i*stride_in_bytes] == 0); + total -= buffer[i & STBTT__OVER_MASK]; + pixels[i*stride_in_bytes] = (unsigned char) (total / kernel_width); + } + + pixels += 1; + } +} + +static float stbtt__oversample_shift(int oversample) +{ + if (!oversample) + return 0.0f; + + // The prefilter is a box filter of width "oversample", + // which shifts phase by (oversample - 1)/2 pixels in + // oversampled space. We want to shift in the opposite + // direction to counter this. + return (float)-(oversample - 1) / (2.0f * (float)oversample); +} + +// rects array must be big enough to accommodate all characters in the given ranges +STBTT_DEF int stbtt_PackFontRangesGatherRects(stbtt_pack_context *spc, const stbtt_fontinfo *info, stbtt_pack_range *ranges, int num_ranges, stbrp_rect *rects) +{ + int i,j,k; + int missing_glyph_added = 0; + + k=0; + for (i=0; i < num_ranges; ++i) { + float fh = ranges[i].font_size; + float scale = fh > 0 ? stbtt_ScaleForPixelHeight(info, fh) : stbtt_ScaleForMappingEmToPixels(info, -fh); + ranges[i].h_oversample = (unsigned char) spc->h_oversample; + ranges[i].v_oversample = (unsigned char) spc->v_oversample; + for (j=0; j < ranges[i].num_chars; ++j) { + int x0,y0,x1,y1; + int codepoint = ranges[i].array_of_unicode_codepoints == NULL ? 
ranges[i].first_unicode_codepoint_in_range + j : ranges[i].array_of_unicode_codepoints[j]; + int glyph = stbtt_FindGlyphIndex(info, codepoint); + if (glyph == 0 && (spc->skip_missing || missing_glyph_added)) { + rects[k].w = rects[k].h = 0; + } else { + stbtt_GetGlyphBitmapBoxSubpixel(info,glyph, + scale * spc->h_oversample, + scale * spc->v_oversample, + 0,0, + &x0,&y0,&x1,&y1); + rects[k].w = (stbrp_coord) (x1-x0 + spc->padding + spc->h_oversample-1); + rects[k].h = (stbrp_coord) (y1-y0 + spc->padding + spc->v_oversample-1); + if (glyph == 0) + missing_glyph_added = 1; + } + ++k; + } + } + + return k; +} + +STBTT_DEF void stbtt_MakeGlyphBitmapSubpixelPrefilter(const stbtt_fontinfo *info, unsigned char *output, int out_w, int out_h, int out_stride, float scale_x, float scale_y, float shift_x, float shift_y, int prefilter_x, int prefilter_y, float *sub_x, float *sub_y, int glyph) +{ + stbtt_MakeGlyphBitmapSubpixel(info, + output, + out_w - (prefilter_x - 1), + out_h - (prefilter_y - 1), + out_stride, + scale_x, + scale_y, + shift_x, + shift_y, + glyph); + + if (prefilter_x > 1) + stbtt__h_prefilter(output, out_w, out_h, out_stride, prefilter_x); + + if (prefilter_y > 1) + stbtt__v_prefilter(output, out_w, out_h, out_stride, prefilter_y); + + *sub_x = stbtt__oversample_shift(prefilter_x); + *sub_y = stbtt__oversample_shift(prefilter_y); +} + +// rects array must be big enough to accommodate all characters in the given ranges +STBTT_DEF int stbtt_PackFontRangesRenderIntoRects(stbtt_pack_context *spc, const stbtt_fontinfo *info, stbtt_pack_range *ranges, int num_ranges, stbrp_rect *rects) +{ + int i,j,k, missing_glyph = -1, return_value = 1; + + // save current values + int old_h_over = spc->h_oversample; + int old_v_over = spc->v_oversample; + + k = 0; + for (i=0; i < num_ranges; ++i) { + float fh = ranges[i].font_size; + float scale = fh > 0 ? stbtt_ScaleForPixelHeight(info, fh) : stbtt_ScaleForMappingEmToPixels(info, -fh); + float recip_h,recip_v,sub_x,sub_y; + spc->h_oversample = ranges[i].h_oversample; + spc->v_oversample = ranges[i].v_oversample; + recip_h = 1.0f / spc->h_oversample; + recip_v = 1.0f / spc->v_oversample; + sub_x = stbtt__oversample_shift(spc->h_oversample); + sub_y = stbtt__oversample_shift(spc->v_oversample); + for (j=0; j < ranges[i].num_chars; ++j) { + stbrp_rect *r = &rects[k]; + if (r->was_packed && r->w != 0 && r->h != 0) { + stbtt_packedchar *bc = &ranges[i].chardata_for_range[j]; + int advance, lsb, x0,y0,x1,y1; + int codepoint = ranges[i].array_of_unicode_codepoints == NULL ? 
ranges[i].first_unicode_codepoint_in_range + j : ranges[i].array_of_unicode_codepoints[j]; + int glyph = stbtt_FindGlyphIndex(info, codepoint); + stbrp_coord pad = (stbrp_coord) spc->padding; + + // pad on left and top + r->x += pad; + r->y += pad; + r->w -= pad; + r->h -= pad; + stbtt_GetGlyphHMetrics(info, glyph, &advance, &lsb); + stbtt_GetGlyphBitmapBox(info, glyph, + scale * spc->h_oversample, + scale * spc->v_oversample, + &x0,&y0,&x1,&y1); + stbtt_MakeGlyphBitmapSubpixel(info, + spc->pixels + r->x + r->y*spc->stride_in_bytes, + r->w - spc->h_oversample+1, + r->h - spc->v_oversample+1, + spc->stride_in_bytes, + scale * spc->h_oversample, + scale * spc->v_oversample, + 0,0, + glyph); + + if (spc->h_oversample > 1) + stbtt__h_prefilter(spc->pixels + r->x + r->y*spc->stride_in_bytes, + r->w, r->h, spc->stride_in_bytes, + spc->h_oversample); + + if (spc->v_oversample > 1) + stbtt__v_prefilter(spc->pixels + r->x + r->y*spc->stride_in_bytes, + r->w, r->h, spc->stride_in_bytes, + spc->v_oversample); + + bc->x0 = (stbtt_int16) r->x; + bc->y0 = (stbtt_int16) r->y; + bc->x1 = (stbtt_int16) (r->x + r->w); + bc->y1 = (stbtt_int16) (r->y + r->h); + bc->xadvance = scale * advance; + bc->xoff = (float) x0 * recip_h + sub_x; + bc->yoff = (float) y0 * recip_v + sub_y; + bc->xoff2 = (x0 + r->w) * recip_h + sub_x; + bc->yoff2 = (y0 + r->h) * recip_v + sub_y; + + if (glyph == 0) + missing_glyph = j; + } else if (spc->skip_missing) { + return_value = 0; + } else if (r->was_packed && r->w == 0 && r->h == 0 && missing_glyph >= 0) { + ranges[i].chardata_for_range[j] = ranges[i].chardata_for_range[missing_glyph]; + } else { + return_value = 0; // if any fail, report failure + } + + ++k; + } + } + + // restore original values + spc->h_oversample = old_h_over; + spc->v_oversample = old_v_over; + + return return_value; +} + +STBTT_DEF void stbtt_PackFontRangesPackRects(stbtt_pack_context *spc, stbrp_rect *rects, int num_rects) +{ + stbrp_pack_rects((stbrp_context *) spc->pack_info, rects, num_rects); +} + +STBTT_DEF int stbtt_PackFontRanges(stbtt_pack_context *spc, const unsigned char *fontdata, int font_index, stbtt_pack_range *ranges, int num_ranges) +{ + stbtt_fontinfo info; + int i,j,n, return_value = 1; + //stbrp_context *context = (stbrp_context *) spc->pack_info; + stbrp_rect *rects; + + // flag all characters as NOT packed + for (i=0; i < num_ranges; ++i) + for (j=0; j < ranges[i].num_chars; ++j) + ranges[i].chardata_for_range[j].x0 = + ranges[i].chardata_for_range[j].y0 = + ranges[i].chardata_for_range[j].x1 = + ranges[i].chardata_for_range[j].y1 = 0; + + n = 0; + for (i=0; i < num_ranges; ++i) + n += ranges[i].num_chars; + + rects = (stbrp_rect *) STBTT_malloc(sizeof(*rects) * n, spc->user_allocator_context); + if (rects == NULL) + return 0; + + info.userdata = spc->user_allocator_context; + stbtt_InitFont(&info, fontdata, stbtt_GetFontOffsetForIndex(fontdata,font_index)); + + n = stbtt_PackFontRangesGatherRects(spc, &info, ranges, num_ranges, rects); + + stbtt_PackFontRangesPackRects(spc, rects, n); + + return_value = stbtt_PackFontRangesRenderIntoRects(spc, &info, ranges, num_ranges, rects); + + STBTT_free(rects, spc->user_allocator_context); + return return_value; +} + +STBTT_DEF int stbtt_PackFontRange(stbtt_pack_context *spc, const unsigned char *fontdata, int font_index, float font_size, + int first_unicode_codepoint_in_range, int num_chars_in_range, stbtt_packedchar *chardata_for_range) +{ + stbtt_pack_range range; + range.first_unicode_codepoint_in_range = first_unicode_codepoint_in_range; + 
range.array_of_unicode_codepoints = NULL; + range.num_chars = num_chars_in_range; + range.chardata_for_range = chardata_for_range; + range.font_size = font_size; + return stbtt_PackFontRanges(spc, fontdata, font_index, &range, 1); +} + +STBTT_DEF void stbtt_GetScaledFontVMetrics(const unsigned char *fontdata, int index, float size, float *ascent, float *descent, float *lineGap) +{ + int i_ascent, i_descent, i_lineGap; + float scale; + stbtt_fontinfo info; + stbtt_InitFont(&info, fontdata, stbtt_GetFontOffsetForIndex(fontdata, index)); + scale = size > 0 ? stbtt_ScaleForPixelHeight(&info, size) : stbtt_ScaleForMappingEmToPixels(&info, -size); + stbtt_GetFontVMetrics(&info, &i_ascent, &i_descent, &i_lineGap); + *ascent = (float) i_ascent * scale; + *descent = (float) i_descent * scale; + *lineGap = (float) i_lineGap * scale; +} + +STBTT_DEF void stbtt_GetPackedQuad(const stbtt_packedchar *chardata, int pw, int ph, int char_index, float *xpos, float *ypos, stbtt_aligned_quad *q, int align_to_integer) +{ + float ipw = 1.0f / pw, iph = 1.0f / ph; + const stbtt_packedchar *b = chardata + char_index; + + if (align_to_integer) { + float x = (float) STBTT_ifloor((*xpos + b->xoff) + 0.5f); + float y = (float) STBTT_ifloor((*ypos + b->yoff) + 0.5f); + q->x0 = x; + q->y0 = y; + q->x1 = x + b->xoff2 - b->xoff; + q->y1 = y + b->yoff2 - b->yoff; + } else { + q->x0 = *xpos + b->xoff; + q->y0 = *ypos + b->yoff; + q->x1 = *xpos + b->xoff2; + q->y1 = *ypos + b->yoff2; + } + + q->s0 = b->x0 * ipw; + q->t0 = b->y0 * iph; + q->s1 = b->x1 * ipw; + q->t1 = b->y1 * iph; + + *xpos += b->xadvance; +} + +////////////////////////////////////////////////////////////////////////////// +// +// sdf computation +// + +#define STBTT_min(a,b) ((a) < (b) ? (a) : (b)) +#define STBTT_max(a,b) ((a) < (b) ? 
(b) : (a)) + +static int stbtt__ray_intersect_bezier(float orig[2], float ray[2], float q0[2], float q1[2], float q2[2], float hits[2][2]) +{ + float q0perp = q0[1]*ray[0] - q0[0]*ray[1]; + float q1perp = q1[1]*ray[0] - q1[0]*ray[1]; + float q2perp = q2[1]*ray[0] - q2[0]*ray[1]; + float roperp = orig[1]*ray[0] - orig[0]*ray[1]; + + float a = q0perp - 2*q1perp + q2perp; + float b = q1perp - q0perp; + float c = q0perp - roperp; + + float s0 = 0., s1 = 0.; + int num_s = 0; + + if (a != 0.0) { + float discr = b*b - a*c; + if (discr > 0.0) { + float rcpna = -1 / a; + float d = (float) STBTT_sqrt(discr); + s0 = (b+d) * rcpna; + s1 = (b-d) * rcpna; + if (s0 >= 0.0 && s0 <= 1.0) + num_s = 1; + if (d > 0.0 && s1 >= 0.0 && s1 <= 1.0) { + if (num_s == 0) s0 = s1; + ++num_s; + } + } + } else { + // 2*b*s + c = 0 + // s = -c / (2*b) + s0 = c / (-2 * b); + if (s0 >= 0.0 && s0 <= 1.0) + num_s = 1; + } + + if (num_s == 0) + return 0; + else { + float rcp_len2 = 1 / (ray[0]*ray[0] + ray[1]*ray[1]); + float rayn_x = ray[0] * rcp_len2, rayn_y = ray[1] * rcp_len2; + + float q0d = q0[0]*rayn_x + q0[1]*rayn_y; + float q1d = q1[0]*rayn_x + q1[1]*rayn_y; + float q2d = q2[0]*rayn_x + q2[1]*rayn_y; + float rod = orig[0]*rayn_x + orig[1]*rayn_y; + + float q10d = q1d - q0d; + float q20d = q2d - q0d; + float q0rd = q0d - rod; + + hits[0][0] = q0rd + s0*(2.0f - 2.0f*s0)*q10d + s0*s0*q20d; + hits[0][1] = a*s0+b; + + if (num_s > 1) { + hits[1][0] = q0rd + s1*(2.0f - 2.0f*s1)*q10d + s1*s1*q20d; + hits[1][1] = a*s1+b; + return 2; + } else { + return 1; + } + } +} + +static int equal(float *a, float *b) +{ + return (a[0] == b[0] && a[1] == b[1]); +} + +static int stbtt__compute_crossings_x(float x, float y, int nverts, stbtt_vertex *verts) +{ + int i; + float orig[2], ray[2] = { 1, 0 }; + float y_frac; + int winding = 0; + + // make sure y never passes through a vertex of the shape + y_frac = (float) STBTT_fmod(y, 1.0f); + if (y_frac < 0.01f) + y += 0.01f; + else if (y_frac > 0.99f) + y -= 0.01f; + + orig[0] = x; + orig[1] = y; + + // test a ray from (-infinity,y) to (x,y) + for (i=0; i < nverts; ++i) { + if (verts[i].type == STBTT_vline) { + int x0 = (int) verts[i-1].x, y0 = (int) verts[i-1].y; + int x1 = (int) verts[i ].x, y1 = (int) verts[i ].y; + if (y > STBTT_min(y0,y1) && y < STBTT_max(y0,y1) && x > STBTT_min(x0,x1)) { + float x_inter = (y - y0) / (y1 - y0) * (x1-x0) + x0; + if (x_inter < x) + winding += (y0 < y1) ? 1 : -1; + } + } + if (verts[i].type == STBTT_vcurve) { + int x0 = (int) verts[i-1].x , y0 = (int) verts[i-1].y ; + int x1 = (int) verts[i ].cx, y1 = (int) verts[i ].cy; + int x2 = (int) verts[i ].x , y2 = (int) verts[i ].y ; + int ax = STBTT_min(x0,STBTT_min(x1,x2)), ay = STBTT_min(y0,STBTT_min(y1,y2)); + int by = STBTT_max(y0,STBTT_max(y1,y2)); + if (y > ay && y < by && x > ax) { + float q0[2],q1[2],q2[2]; + float hits[2][2]; + q0[0] = (float)x0; + q0[1] = (float)y0; + q1[0] = (float)x1; + q1[1] = (float)y1; + q2[0] = (float)x2; + q2[1] = (float)y2; + if (equal(q0,q1) || equal(q1,q2)) { + x0 = (int)verts[i-1].x; + y0 = (int)verts[i-1].y; + x1 = (int)verts[i ].x; + y1 = (int)verts[i ].y; + if (y > STBTT_min(y0,y1) && y < STBTT_max(y0,y1) && x > STBTT_min(x0,x1)) { + float x_inter = (y - y0) / (y1 - y0) * (x1-x0) + x0; + if (x_inter < x) + winding += (y0 < y1) ? 1 : -1; + } + } else { + int num_hits = stbtt__ray_intersect_bezier(orig, ray, q0, q1, q2, hits); + if (num_hits >= 1) + if (hits[0][0] < 0) + winding += (hits[0][1] < 0 ? 
-1 : 1); + if (num_hits >= 2) + if (hits[1][0] < 0) + winding += (hits[1][1] < 0 ? -1 : 1); + } + } + } + } + return winding; +} + +static float stbtt__cuberoot( float x ) +{ + if (x<0) + return -(float) STBTT_pow(-x,1.0f/3.0f); + else + return (float) STBTT_pow( x,1.0f/3.0f); +} + +// x^3 + a*x^2 + b*x + c = 0 +static int stbtt__solve_cubic(float a, float b, float c, float* r) +{ + float s = -a / 3; + float p = b - a*a / 3; + float q = a * (2*a*a - 9*b) / 27 + c; + float p3 = p*p*p; + float d = q*q + 4*p3 / 27; + if (d >= 0) { + float z = (float) STBTT_sqrt(d); + float u = (-q + z) / 2; + float v = (-q - z) / 2; + u = stbtt__cuberoot(u); + v = stbtt__cuberoot(v); + r[0] = s + u + v; + return 1; + } else { + float u = (float) STBTT_sqrt(-p/3); + float v = (float) STBTT_acos(-STBTT_sqrt(-27/p3) * q / 2) / 3; // p3 must be negative, since d is negative + float m = (float) STBTT_cos(v); + float n = (float) STBTT_cos(v-3.141592/2)*1.732050808f; + r[0] = s + u * 2 * m; + r[1] = s - u * (m + n); + r[2] = s - u * (m - n); + + //STBTT_assert( STBTT_fabs(((r[0]+a)*r[0]+b)*r[0]+c) < 0.05f); // these asserts may not be safe at all scales, though they're in bezier t parameter units so maybe? + //STBTT_assert( STBTT_fabs(((r[1]+a)*r[1]+b)*r[1]+c) < 0.05f); + //STBTT_assert( STBTT_fabs(((r[2]+a)*r[2]+b)*r[2]+c) < 0.05f); + return 3; + } +} + +STBTT_DEF unsigned char * stbtt_GetGlyphSDF(const stbtt_fontinfo *info, float scale, int glyph, int padding, unsigned char onedge_value, float pixel_dist_scale, int *width, int *height, int *xoff, int *yoff) +{ + float scale_x = scale, scale_y = scale; + int ix0,iy0,ix1,iy1; + int w,h; + unsigned char *data; + + if (scale == 0) return NULL; + + stbtt_GetGlyphBitmapBoxSubpixel(info, glyph, scale, scale, 0.0f,0.0f, &ix0,&iy0,&ix1,&iy1); + + // if empty, return NULL + if (ix0 == ix1 || iy0 == iy1) + return NULL; + + ix0 -= padding; + iy0 -= padding; + ix1 += padding; + iy1 += padding; + + w = (ix1 - ix0); + h = (iy1 - iy0); + + if (width ) *width = w; + if (height) *height = h; + if (xoff ) *xoff = ix0; + if (yoff ) *yoff = iy0; + + // invert for y-downwards bitmaps + scale_y = -scale_y; + + { + int x,y,i,j; + float *precompute; + stbtt_vertex *verts; + int num_verts = stbtt_GetGlyphShape(info, glyph, &verts); + data = (unsigned char *) STBTT_malloc(w * h, info->userdata); + precompute = (float *) STBTT_malloc(num_verts * sizeof(float), info->userdata); + + for (i=0,j=num_verts-1; i < num_verts; j=i++) { + if (verts[i].type == STBTT_vline) { + float x0 = verts[i].x*scale_x, y0 = verts[i].y*scale_y; + float x1 = verts[j].x*scale_x, y1 = verts[j].y*scale_y; + float dist = (float) STBTT_sqrt((x1-x0)*(x1-x0) + (y1-y0)*(y1-y0)); + precompute[i] = (dist == 0) ? 
0.0f : 1.0f / dist; + } else if (verts[i].type == STBTT_vcurve) { + float x2 = verts[j].x *scale_x, y2 = verts[j].y *scale_y; + float x1 = verts[i].cx*scale_x, y1 = verts[i].cy*scale_y; + float x0 = verts[i].x *scale_x, y0 = verts[i].y *scale_y; + float bx = x0 - 2*x1 + x2, by = y0 - 2*y1 + y2; + float len2 = bx*bx + by*by; + if (len2 != 0.0f) + precompute[i] = 1.0f / (bx*bx + by*by); + else + precompute[i] = 0.0f; + } else + precompute[i] = 0.0f; + } + + for (y=iy0; y < iy1; ++y) { + for (x=ix0; x < ix1; ++x) { + float val; + float min_dist = 999999.0f; + float sx = (float) x + 0.5f; + float sy = (float) y + 0.5f; + float x_gspace = (sx / scale_x); + float y_gspace = (sy / scale_y); + + int winding = stbtt__compute_crossings_x(x_gspace, y_gspace, num_verts, verts); // @OPTIMIZE: this could just be a rasterization, but needs to be line vs. non-tesselated curves so a new path + + for (i=0; i < num_verts; ++i) { + float x0 = verts[i].x*scale_x, y0 = verts[i].y*scale_y; + + if (verts[i].type == STBTT_vline && precompute[i] != 0.0f) { + float x1 = verts[i-1].x*scale_x, y1 = verts[i-1].y*scale_y; + + float dist,dist2 = (x0-sx)*(x0-sx) + (y0-sy)*(y0-sy); + if (dist2 < min_dist*min_dist) + min_dist = (float) STBTT_sqrt(dist2); + + // coarse culling against bbox + //if (sx > STBTT_min(x0,x1)-min_dist && sx < STBTT_max(x0,x1)+min_dist && + // sy > STBTT_min(y0,y1)-min_dist && sy < STBTT_max(y0,y1)+min_dist) + dist = (float) STBTT_fabs((x1-x0)*(y0-sy) - (y1-y0)*(x0-sx)) * precompute[i]; + STBTT_assert(i != 0); + if (dist < min_dist) { + // check position along line + // x' = x0 + t*(x1-x0), y' = y0 + t*(y1-y0) + // minimize (x'-sx)*(x'-sx)+(y'-sy)*(y'-sy) + float dx = x1-x0, dy = y1-y0; + float px = x0-sx, py = y0-sy; + // minimize (px+t*dx)^2 + (py+t*dy)^2 = px*px + 2*px*dx*t + t^2*dx*dx + py*py + 2*py*dy*t + t^2*dy*dy + // derivative: 2*px*dx + 2*py*dy + (2*dx*dx+2*dy*dy)*t, set to 0 and solve + float t = -(px*dx + py*dy) / (dx*dx + dy*dy); + if (t >= 0.0f && t <= 1.0f) + min_dist = dist; + } + } else if (verts[i].type == STBTT_vcurve) { + float x2 = verts[i-1].x *scale_x, y2 = verts[i-1].y *scale_y; + float x1 = verts[i ].cx*scale_x, y1 = verts[i ].cy*scale_y; + float box_x0 = STBTT_min(STBTT_min(x0,x1),x2); + float box_y0 = STBTT_min(STBTT_min(y0,y1),y2); + float box_x1 = STBTT_max(STBTT_max(x0,x1),x2); + float box_y1 = STBTT_max(STBTT_max(y0,y1),y2); + // coarse culling against bbox to avoid computing cubic unnecessarily + if (sx > box_x0-min_dist && sx < box_x1+min_dist && sy > box_y0-min_dist && sy < box_y1+min_dist) { + int num=0; + float ax = x1-x0, ay = y1-y0; + float bx = x0 - 2*x1 + x2, by = y0 - 2*y1 + y2; + float mx = x0 - sx, my = y0 - sy; + float res[3] = {0.f,0.f,0.f}; + float px,py,t,it,dist2; + float a_inv = precompute[i]; + if (a_inv == 0.0) { // if a_inv is 0, it's 2nd degree so use quadratic formula + float a = 3*(ax*bx + ay*by); + float b = 2*(ax*ax + ay*ay) + (mx*bx+my*by); + float c = mx*ax+my*ay; + if (a == 0.0) { // if a is 0, it's linear + if (b != 0.0) { + res[num++] = -c/b; + } + } else { + float discriminant = b*b - 4*a*c; + if (discriminant < 0) + num = 0; + else { + float root = (float) STBTT_sqrt(discriminant); + res[0] = (-b - root)/(2*a); + res[1] = (-b + root)/(2*a); + num = 2; // don't bother distinguishing 1-solution case, as code below will still work + } + } + } else { + float b = 3*(ax*bx + ay*by) * a_inv; // could precompute this as it doesn't depend on sample point + float c = (2*(ax*ax + ay*ay) + (mx*bx+my*by)) * a_inv; + float d = (mx*ax+my*ay) * a_inv; 
+ num = stbtt__solve_cubic(b, c, d, res); + } + dist2 = (x0-sx)*(x0-sx) + (y0-sy)*(y0-sy); + if (dist2 < min_dist*min_dist) + min_dist = (float) STBTT_sqrt(dist2); + + if (num >= 1 && res[0] >= 0.0f && res[0] <= 1.0f) { + t = res[0], it = 1.0f - t; + px = it*it*x0 + 2*t*it*x1 + t*t*x2; + py = it*it*y0 + 2*t*it*y1 + t*t*y2; + dist2 = (px-sx)*(px-sx) + (py-sy)*(py-sy); + if (dist2 < min_dist * min_dist) + min_dist = (float) STBTT_sqrt(dist2); + } + if (num >= 2 && res[1] >= 0.0f && res[1] <= 1.0f) { + t = res[1], it = 1.0f - t; + px = it*it*x0 + 2*t*it*x1 + t*t*x2; + py = it*it*y0 + 2*t*it*y1 + t*t*y2; + dist2 = (px-sx)*(px-sx) + (py-sy)*(py-sy); + if (dist2 < min_dist * min_dist) + min_dist = (float) STBTT_sqrt(dist2); + } + if (num >= 3 && res[2] >= 0.0f && res[2] <= 1.0f) { + t = res[2], it = 1.0f - t; + px = it*it*x0 + 2*t*it*x1 + t*t*x2; + py = it*it*y0 + 2*t*it*y1 + t*t*y2; + dist2 = (px-sx)*(px-sx) + (py-sy)*(py-sy); + if (dist2 < min_dist * min_dist) + min_dist = (float) STBTT_sqrt(dist2); + } + } + } + } + if (winding == 0) + min_dist = -min_dist; // if outside the shape, value is negative + val = onedge_value + pixel_dist_scale * min_dist; + if (val < 0) + val = 0; + else if (val > 255) + val = 255; + data[(y-iy0)*w+(x-ix0)] = (unsigned char) val; + } + } + STBTT_free(precompute, info->userdata); + STBTT_free(verts, info->userdata); + } + return data; +} + +STBTT_DEF unsigned char * stbtt_GetCodepointSDF(const stbtt_fontinfo *info, float scale, int codepoint, int padding, unsigned char onedge_value, float pixel_dist_scale, int *width, int *height, int *xoff, int *yoff) +{ + return stbtt_GetGlyphSDF(info, scale, stbtt_FindGlyphIndex(info, codepoint), padding, onedge_value, pixel_dist_scale, width, height, xoff, yoff); +} + +STBTT_DEF void stbtt_FreeSDF(unsigned char *bitmap, void *userdata) +{ + STBTT_free(bitmap, userdata); +} + +////////////////////////////////////////////////////////////////////////////// +// +// font name matching -- recommended not to use this +// + +// check if a utf8 string contains a prefix which is the utf16 string; if so return length of matching utf8 string +static stbtt_int32 stbtt__CompareUTF8toUTF16_bigendian_prefix(stbtt_uint8 *s1, stbtt_int32 len1, stbtt_uint8 *s2, stbtt_int32 len2) +{ + stbtt_int32 i=0; + + // convert utf16 to utf8 and compare the results while converting + while (len2) { + stbtt_uint16 ch = s2[0]*256 + s2[1]; + if (ch < 0x80) { + if (i >= len1) return -1; + if (s1[i++] != ch) return -1; + } else if (ch < 0x800) { + if (i+1 >= len1) return -1; + if (s1[i++] != 0xc0 + (ch >> 6)) return -1; + if (s1[i++] != 0x80 + (ch & 0x3f)) return -1; + } else if (ch >= 0xd800 && ch < 0xdc00) { + stbtt_uint32 c; + stbtt_uint16 ch2 = s2[2]*256 + s2[3]; + if (i+3 >= len1) return -1; + c = ((ch - 0xd800) << 10) + (ch2 - 0xdc00) + 0x10000; + if (s1[i++] != 0xf0 + (c >> 18)) return -1; + if (s1[i++] != 0x80 + ((c >> 12) & 0x3f)) return -1; + if (s1[i++] != 0x80 + ((c >> 6) & 0x3f)) return -1; + if (s1[i++] != 0x80 + ((c ) & 0x3f)) return -1; + s2 += 2; // plus another 2 below + len2 -= 2; + } else if (ch >= 0xdc00 && ch < 0xe000) { + return -1; + } else { + if (i+2 >= len1) return -1; + if (s1[i++] != 0xe0 + (ch >> 12)) return -1; + if (s1[i++] != 0x80 + ((ch >> 6) & 0x3f)) return -1; + if (s1[i++] != 0x80 + ((ch ) & 0x3f)) return -1; + } + s2 += 2; + len2 -= 2; + } + return i; +} + +static int stbtt_CompareUTF8toUTF16_bigendian_internal(char *s1, int len1, char *s2, int len2) +{ + return len1 == stbtt__CompareUTF8toUTF16_bigendian_prefix((stbtt_uint8*) 
s1, len1, (stbtt_uint8*) s2, len2); +} + +// returns results in whatever encoding you request... but note that 2-byte encodings +// will be BIG-ENDIAN... use stbtt_CompareUTF8toUTF16_bigendian() to compare +STBTT_DEF const char *stbtt_GetFontNameString(const stbtt_fontinfo *font, int *length, int platformID, int encodingID, int languageID, int nameID) +{ + stbtt_int32 i,count,stringOffset; + stbtt_uint8 *fc = font->data; + stbtt_uint32 offset = font->fontstart; + stbtt_uint32 nm = stbtt__find_table(fc, offset, "name"); + if (!nm) return NULL; + + count = ttUSHORT(fc+nm+2); + stringOffset = nm + ttUSHORT(fc+nm+4); + for (i=0; i < count; ++i) { + stbtt_uint32 loc = nm + 6 + 12 * i; + if (platformID == ttUSHORT(fc+loc+0) && encodingID == ttUSHORT(fc+loc+2) + && languageID == ttUSHORT(fc+loc+4) && nameID == ttUSHORT(fc+loc+6)) { + *length = ttUSHORT(fc+loc+8); + return (const char *) (fc+stringOffset+ttUSHORT(fc+loc+10)); + } + } + return NULL; +} + +static int stbtt__matchpair(stbtt_uint8 *fc, stbtt_uint32 nm, stbtt_uint8 *name, stbtt_int32 nlen, stbtt_int32 target_id, stbtt_int32 next_id) +{ + stbtt_int32 i; + stbtt_int32 count = ttUSHORT(fc+nm+2); + stbtt_int32 stringOffset = nm + ttUSHORT(fc+nm+4); + + for (i=0; i < count; ++i) { + stbtt_uint32 loc = nm + 6 + 12 * i; + stbtt_int32 id = ttUSHORT(fc+loc+6); + if (id == target_id) { + // find the encoding + stbtt_int32 platform = ttUSHORT(fc+loc+0), encoding = ttUSHORT(fc+loc+2), language = ttUSHORT(fc+loc+4); + + // is this a Unicode encoding? + if (platform == 0 || (platform == 3 && encoding == 1) || (platform == 3 && encoding == 10)) { + stbtt_int32 slen = ttUSHORT(fc+loc+8); + stbtt_int32 off = ttUSHORT(fc+loc+10); + + // check if there's a prefix match + stbtt_int32 matchlen = stbtt__CompareUTF8toUTF16_bigendian_prefix(name, nlen, fc+stringOffset+off,slen); + if (matchlen >= 0) { + // check for target_id+1 immediately following, with same encoding & language + if (i+1 < count && ttUSHORT(fc+loc+12+6) == next_id && ttUSHORT(fc+loc+12) == platform && ttUSHORT(fc+loc+12+2) == encoding && ttUSHORT(fc+loc+12+4) == language) { + slen = ttUSHORT(fc+loc+12+8); + off = ttUSHORT(fc+loc+12+10); + if (slen == 0) { + if (matchlen == nlen) + return 1; + } else if (matchlen < nlen && name[matchlen] == ' ') { + ++matchlen; + if (stbtt_CompareUTF8toUTF16_bigendian_internal((char*) (name+matchlen), nlen-matchlen, (char*)(fc+stringOffset+off),slen)) + return 1; + } + } else { + // if nothing immediately following + if (matchlen == nlen) + return 1; + } + } + } + + // @TODO handle other encodings + } + } + return 0; +} + +static int stbtt__matches(stbtt_uint8 *fc, stbtt_uint32 offset, stbtt_uint8 *name, stbtt_int32 flags) +{ + stbtt_int32 nlen = (stbtt_int32) STBTT_strlen((char *) name); + stbtt_uint32 nm,hd; + if (!stbtt__isfont(fc+offset)) return 0; + + // check italics/bold/underline flags in macStyle... 
+ if (flags) { + hd = stbtt__find_table(fc, offset, "head"); + if ((ttUSHORT(fc+hd+44) & 7) != (flags & 7)) return 0; + } + + nm = stbtt__find_table(fc, offset, "name"); + if (!nm) return 0; + + if (flags) { + // if we checked the macStyle flags, then just check the family and ignore the subfamily + if (stbtt__matchpair(fc, nm, name, nlen, 16, -1)) return 1; + if (stbtt__matchpair(fc, nm, name, nlen, 1, -1)) return 1; + if (stbtt__matchpair(fc, nm, name, nlen, 3, -1)) return 1; + } else { + if (stbtt__matchpair(fc, nm, name, nlen, 16, 17)) return 1; + if (stbtt__matchpair(fc, nm, name, nlen, 1, 2)) return 1; + if (stbtt__matchpair(fc, nm, name, nlen, 3, -1)) return 1; + } + + return 0; +} + +static int stbtt_FindMatchingFont_internal(unsigned char *font_collection, char *name_utf8, stbtt_int32 flags) +{ + stbtt_int32 i; + for (i=0;;++i) { + stbtt_int32 off = stbtt_GetFontOffsetForIndex(font_collection, i); + if (off < 0) return off; + if (stbtt__matches((stbtt_uint8 *) font_collection, off, (stbtt_uint8*) name_utf8, flags)) + return off; + } +} + +#if defined(__GNUC__) || defined(__clang__) +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wcast-qual" +#endif + +STBTT_DEF int stbtt_BakeFontBitmap(const unsigned char *data, int offset, + float pixel_height, unsigned char *pixels, int pw, int ph, + int first_char, int num_chars, stbtt_bakedchar *chardata) +{ + return stbtt_BakeFontBitmap_internal((unsigned char *) data, offset, pixel_height, pixels, pw, ph, first_char, num_chars, chardata); +} + +STBTT_DEF int stbtt_GetFontOffsetForIndex(const unsigned char *data, int index) +{ + return stbtt_GetFontOffsetForIndex_internal((unsigned char *) data, index); +} + +STBTT_DEF int stbtt_GetNumberOfFonts(const unsigned char *data) +{ + return stbtt_GetNumberOfFonts_internal((unsigned char *) data); +} + +STBTT_DEF int stbtt_InitFont(stbtt_fontinfo *info, const unsigned char *data, int offset) +{ + return stbtt_InitFont_internal(info, (unsigned char *) data, offset); +} + +STBTT_DEF int stbtt_FindMatchingFont(const unsigned char *fontdata, const char *name, int flags) +{ + return stbtt_FindMatchingFont_internal((unsigned char *) fontdata, (char *) name, flags); +} + +STBTT_DEF int stbtt_CompareUTF8toUTF16_bigendian(const char *s1, int len1, const char *s2, int len2) +{ + return stbtt_CompareUTF8toUTF16_bigendian_internal((char *) s1, len1, (char *) s2, len2); +} + +#if defined(__GNUC__) || defined(__clang__) +#pragma GCC diagnostic pop +#endif + +#endif // STB_TRUETYPE_IMPLEMENTATION + + +// FULL VERSION HISTORY +// +// 1.25 (2021-07-11) many fixes +// 1.24 (2020-02-05) fix warning +// 1.23 (2020-02-02) query SVG data for glyphs; query whole kerning table (but only kern not GPOS) +// 1.22 (2019-08-11) minimize missing-glyph duplication; fix kerning if both 'GPOS' and 'kern' are defined +// 1.21 (2019-02-25) fix warning +// 1.20 (2019-02-07) PackFontRange skips missing codepoints; GetScaleFontVMetrics() +// 1.19 (2018-02-11) OpenType GPOS kerning (horizontal only), STBTT_fmod +// 1.18 (2018-01-29) add missing function +// 1.17 (2017-07-23) make more arguments const; doc fix +// 1.16 (2017-07-12) SDF support +// 1.15 (2017-03-03) make more arguments const +// 1.14 (2017-01-16) num-fonts-in-TTC function +// 1.13 (2017-01-02) support OpenType fonts, certain Apple fonts +// 1.12 (2016-10-25) suppress warnings about casting away const with -Wcast-qual +// 1.11 (2016-04-02) fix unused-variable warning +// 1.10 (2016-04-02) allow user-defined fabs() replacement +// fix memory leak if 
fontsize=0.0 +// fix warning from duplicate typedef +// 1.09 (2016-01-16) warning fix; avoid crash on outofmem; use alloc userdata for PackFontRanges +// 1.08 (2015-09-13) document stbtt_Rasterize(); fixes for vertical & horizontal edges +// 1.07 (2015-08-01) allow PackFontRanges to accept arrays of sparse codepoints; +// allow PackFontRanges to pack and render in separate phases; +// fix stbtt_GetFontOFfsetForIndex (never worked for non-0 input?); +// fixed an assert() bug in the new rasterizer +// replace assert() with STBTT_assert() in new rasterizer +// 1.06 (2015-07-14) performance improvements (~35% faster on x86 and x64 on test machine) +// also more precise AA rasterizer, except if shapes overlap +// remove need for STBTT_sort +// 1.05 (2015-04-15) fix misplaced definitions for STBTT_STATIC +// 1.04 (2015-04-15) typo in example +// 1.03 (2015-04-12) STBTT_STATIC, fix memory leak in new packing, various fixes +// 1.02 (2014-12-10) fix various warnings & compile issues w/ stb_rect_pack, C++ +// 1.01 (2014-12-08) fix subpixel position when oversampling to exactly match +// non-oversampled; STBTT_POINT_SIZE for packed case only +// 1.00 (2014-12-06) add new PackBegin etc. API, w/ support for oversampling +// 0.99 (2014-09-18) fix multiple bugs with subpixel rendering (ryg) +// 0.9 (2014-08-07) support certain mac/iOS fonts without an MS platformID +// 0.8b (2014-07-07) fix a warning +// 0.8 (2014-05-25) fix a few more warnings +// 0.7 (2013-09-25) bugfix: subpixel glyph bug fixed in 0.5 had come back +// 0.6c (2012-07-24) improve documentation +// 0.6b (2012-07-20) fix a few more warnings +// 0.6 (2012-07-17) fix warnings; added stbtt_ScaleForMappingEmToPixels, +// stbtt_GetFontBoundingBox, stbtt_IsGlyphEmpty +// 0.5 (2011-12-09) bugfixes: +// subpixel glyph renderer computed wrong bounding box +// first vertex of shape can be off-curve (FreeSans) +// 0.4b (2011-12-03) fixed an error in the font baking example +// 0.4 (2011-12-01) kerning, subpixel rendering (tor) +// bugfixes for: +// codepoint-to-glyph conversion using table fmt=12 +// codepoint-to-glyph conversion using table fmt=4 +// stbtt_GetBakedQuad with non-square texture (Zer) +// updated Hello World! sample to use kerning and subpixel +// fixed some warnings +// 0.3 (2009-06-24) cmap fmt=12, compound shapes (MM) +// userdata, malloc-from-userdata, non-zero fill (stb) +// 0.2 (2009-03-11) Fix unsigned/signed char warnings +// 0.1 (2009-03-09) First public release +// + +/* +------------------------------------------------------------------------------ +This software is available under 2 licenses -- choose whichever you prefer. +------------------------------------------------------------------------------ +ALTERNATIVE A - MIT License +Copyright (c) 2017 Sean Barrett +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +------------------------------------------------------------------------------ +ALTERNATIVE B - Public Domain (www.unlicense.org) +This is free and unencumbered software released into the public domain. +Anyone is free to copy, modify, publish, use, compile, sell, or distribute this +software, either in source code form or as a compiled binary, for any purpose, +commercial or non-commercial, and by any means. +In jurisdictions that recognize copyright laws, the author or authors of this +software dedicate any and all copyright interest in the software to the public +domain. We make this dedication for the benefit of the public at large and to +the detriment of our heirs and successors. We intend this dedication to be an +overt act of relinquishment in perpetuity of all present and future rights to +this software under copyright law. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +------------------------------------------------------------------------------ +*/ diff --git a/modules/v4d/tutorials/00-intro.markdown b/modules/v4d/tutorials/00-intro.markdown new file mode 100644 index 000000000..21d9514d7 --- /dev/null +++ b/modules/v4d/tutorials/00-intro.markdown @@ -0,0 +1,174 @@ +# V4D {#v4d} + +[TOC] + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +# What is V4D? +V4D offers a way of writing graphical (on- and offscreen) high-performance applications with OpenCV. It is light-weight and unencumbered by QT or GTK licenses. It features vector graphics using [NanoVG](https://github.com/inniyah/nanovg), a GUI based on [ImGUI](https://github.com/ocornut/imgui) and (on supported systems) OpenCL/OpenGL and OpenCL/VAAPI interoperability. It should be included in [OpenCV-contrib](https://github.com/opencv/opencv_contrib) once it is ready. + +# Why V4D? +Please refer to the online demos in the \ref v4d_tutorials and \ref v4d_demos section to see at a glance what it can do for you. **But note**: The online demos are slower than native builds and are sometimes missing features. If you want full performance (including hardware acceleration) you should really create a native build and test it. + +* **OpenGL**: Easy access to OpenGL. +* **GUI**: Simple yet powerful user interfaces through ImGui. +* **Vector graphics**: Elegant and fast vector graphics through NanoVG. +* **Font rendering**: Loading of fonts and sophisticated rendering options. +* **Video pipeline**: Through a simple source/sink system videos can be efficiently read, displayed, edited and saved. +* **Hardware acceleration**: Transparent use of hardware acceleration where possible (e.g. CL-GL interop, VAAPI and CL-VAAPI interop). In fact, it is possible to write programs that run almost entirely on the GPU, provided the required driver features are available. +* **No more highgui** with its heavy dependencies, licenses and limitations.
+* **\ref v4d_webassembly_support**. + +# Design Notes +* V4D is not thread-safe, though it is possible to have several V4D objects in one or more threads and synchronize them using ```V4D::makeCurrent()```. This is a limitation of GLFW3/EGL. That said, OpenCV algorithms are multi-threaded as usual. +* V4D uses InputArray/OutputArray/InputOutputArray which gives you the option to work with Mat, std::vector and UMat. Still, you should prefer UMat whenever possible to automatically use hardware capabilities where available. +* Access to different subsystems (opengl, framebuffer, nanovg and imgui) is provided through "contexts". A context is simply a function that takes a functor, sets up the subsystem, executes the functor and tears down the subsystem. +* ```V4D::run``` is not a context. It is an abstraction of a run loop that takes a functor and runs until the application terminates or the functor returns false. This is necessary for portability reasons. +* Contexts ***may not*** be nested. + +For example, to create an OpenGL context and set the GL viewport: +@code{.cpp} +//Creates a V4D object for on-screen rendering +Ptr<V4D> v4d = V4D::make(Size(WIDTH, HEIGHT), "GL viewport"); + +//Takes care of OpenGL states in the background +v4d->gl([](const Size sz) { + glViewport(0, 0, sz.width, sz.height); +}); +@endcode + +# GPU Support +* Intel Gen 8+ (Tested: Gen 11 + Gen 13) is supported best +* NVIDIA Ada Lovelace (Tested: RTX 4070 Ti) with proprietary drivers (535.104.05) and CUDA toolkit (12.2) works, but video writing is very slow unless you change the codec to H264 or create a GStreamer sink using nvenc. +* AMD: never tested + +# Requirements +* C++20 (at the moment) +* OpenGL 3.2 Core (optionally Compat)/OpenGL ES 3.0/WebGL2 + +# Optional requirements +* Support for OpenCL 1.2 +* Support for cl_khr_gl_sharing and cl_intel_va_api_media_sharing OpenCL extensions. + +# Dependencies +* [My OpenCV 4.x fork](https://github.com/kallaballa/opencv) (it works with mainline OpenCV 4.x as well, but some features will be missing) +* GLEW +* GLFW3 +* NanoVG (included as a sub-repo) +* ImGui (included as a sub-repo) + +# Optional: Dependencies for demos +* (At the time of writing) If you want CL-GL interop on a recent Intel platform you might need to build [compute-runtime](https://github.com/intel/compute-runtime). The first version of compute-runtime shipping CL-GL interop is **23.13.26032**. + +# Tutorials {#v4d_tutorials} +The tutorials are designed to be read one after the other to give you a good overview of the key concepts of V4D. After that you can move on to the demos. + +* \ref v4d_display_image_pipeline +* \ref v4d_display_image_fb +* \ref v4d_display_image_nvg +* \ref v4d_vector_graphics +* \ref v4d_vector_graphics_and_fb +* \ref v4d_render_opengl +* \ref v4d_font_rendering +* \ref v4d_video_editing +* \ref v4d_custom_source_and_sink +* \ref v4d_font_with_gui + +# Demos {#v4d_demos} +The goal of the demos is to show how to use V4D to the fullest. They also show how to use V4D to create programs that run mostly (the part that matters) on the GPU (when driver capabilities allow). They are also a good starting point for your own applications because they touch many key aspects and algorithms of OpenCV.
+ +* \ref v4d_cube +* \ref v4d_many_cubes +* \ref v4d_video +* \ref v4d_nanovg +* \ref v4d_shader +* \ref v4d_font +* \ref v4d_pedestrian +* \ref v4d_optflow +* \ref v4d_beauty + +# Instructions for Ubuntu 22.04.2 LTS +You need to build OpenCV with V4D + +## Install required packages + +```bash +apt install vainfo clinfo libqt5opengl5-dev freeglut3-dev ocl-icd-opencl-dev libavcodec-dev libavdevice-dev libavfilter-dev libavformat-dev libavutil-dev libpostproc-dev libswresample-dev libswscale-dev libglfw3-dev libstb-dev libglew-dev cmake make git-core build-essential opencl-clhpp-headers pkg-config zlib1g-dev doxygen libxinerama-dev libxcursor-dev libxi-dev libva-dev yt-dlp wget intel-opencl-icd ca-certificates +``` +## Optional: Install if you want to build your own packages +```bash +apt install ubuntu-dev-tools dh-cmake gdebi +``` +## EITHER: Minimal V4D build without examples and demos + +```bash +git clone --branch GCV https://github.com/kallaballa/opencv.git +git clone https://github.com/kallaballa/V4D.git +mkdir opencv/build +cd opencv/build + cmake -DCMAKE_BUILD_TYPE=Release -DCV_TRACE=OFF -DBUILD_SHARED_LIBS=ON -DWITH_OPENGL=ON -DOPENCV_ENABLE_EGL=ON -DOPENCV_FFMPEG_ENABLE_LIBAVDEVICE=ON -DWITH_QT=ON -DWITH_FFMPEG=ON -DOPENCV_FFMPEG_SKIP_BUILD_CHECK=ON -DWITH_VA=ON -DWITH_VA_INTEL=ON -DWITH_1394=OFF -DWITH_ADE=OFF -DWITH_VTK=OFF -DWITH_EIGEN=OFF -DWITH_GTK=OFF -DWITH_GTK_2_X=OFF -DWITH_IPP=OFF -DWITH_JASPER=OFF -DWITH_WEBP=OFF -DWITH_OPENEXR=OFF -DWITH_OPENVX=OFF -DWITH_OPENNI=OFF -DWITH_OPENNI2=OFF-DWITH_TBB=OFF -DWITH_TIFF=OFF -DWITH_OPENCL=ON -DWITH_OPENCL_SVM=ON -DWITH_OPENCLAMDFFT=OFF -DWITH_OPENCLAMDBLAS=OFF -DWITH_GPHOTO2=OFF -DWITH_LAPACK=OFF -DWITH_ITT=OFF -DWITH_QUIRC=ON -DBUILD_ZLIB=OFF -DBUILD_opencv_apps=OFF -DBUILD_opencv_calib3d=OFF -DBUIlD_opencv_ccalib=OFF -DBUILD_opencv_dnn=OFF -DBUILD_opencv_features2d=OFF -DBUILD_opencv_flann=OFF -DBUILD_opencv_gapi=OFF -DBUILD_opencv_ml=OFF -DBUILD_opencv_photo=OFF -DBUILD_opencv_imgcodecs=ON -DBUILD_opencv_shape=OFF -DBUILD_opencv_videoio=ON -DBUILD_opencv_videostab=OFF -DBUILD_opencv_highgui=OFF -DBUILD_opencv_superres=OFF -DBUILD_opencv_stitching=OFF -DBUILD_opencv_java=OFF -DBUILD_opencv_js=OFF -DBUILD_opencv_python2=OFF -DBUILD_opencv_python3=OFF -DBUILD_opencv_alphamat=OFF -DBUILD_opencv_aruco=OFF -DBUILD_opencv_barcode=OFF -DBUILD_opencv_bgsegm=OFF -DBUILD_opencv_bioinspired=OFF -DBUILD_opencv_ccalib=OFF -DBUILD_opencv_cnn_3dobj=OFF -DBUILD_opencv_cudaarithm=OFF -DBUILD_opencv_cudabgsegm=OFF -DBUILD_opencv_cudacodec=OFF -DBUILD_opencv_cudafeatures2d=OFF -DBUILD_opencv_cudafilters=OFF -DBUILD_opencv_cudaimgproc=OFF -DBUILD_opencv_cudalegacy=OFF -DBUILD_opencv_cudaobjdetect=OFF -DBUILD_opencv_cudaoptflow=OFF -DBUILD_opencv_cudastereo=OFF -DBUILD_opencv_cudawarping=OFF -DBUILD_opencv_cudev=OFF -DBUILD_opencv_cvv=OFF -DBUILD_opencv_datasets=OFF -DBUILD_opencv_dnn_objdetect=OFF -DBUILD_opencv_dnns_easily_fooled=OFF -DBUILD_opencv_dnn_superres=OFF -DBUILD_opencv_dpm=OFF -DBUILD_opencv_face=OFF -DBUILD_opencv_freetype=OFF -DBUILD_opencv_fuzzy=OFF -DBUILD_opencv_hdf=OFF -DBUILD_opencv_hfs=OFF -DBUILD_opencv_img_hash=OFF -DBUILD_opencv_intensity_transform=OFF -DBUILD_opencv_julia=OFF -DBUILD_opencv_line_descriptor=OFF -DBUILD_opencv_matlab=OFF -DBUILD_opencv_mcc=OFF -DBUILD_opencv_optflow=OFF -DBUILD_opencv_ovis=OFF -DBUILD_opencv_phase_unwrapping=OFF -DBUILD_opencv_plot=OFF -DBUILD_opencv_quality=OFF -DBUILD_opencv_rapid=OFF -DBUILD_opencv_README.md=OFF -DBUILD_opencv_reg=OFF -DBUILD_opencv_rgbd=OFF -DBUILD_opencv_saliency=OFF 
-DBUILD_opencv_sfm=OFF -DBUILD_opencv_shape=OFF -DBUILD_opencv_stereo=OFF -DBUILD_opencv_structured_light=OFF -DBUILD_opencv_superres=OFF -DBUILD_opencv_surface_matching=OFF -DBUILD_opencv_text=OFF -DBUILD_opencv_tracking=OFF -DBUILD_opencv_videostab=OFF -DBUILD_opencv_viz=OFF -DBUILD_opencv_wechat_qrcode=OFF -DBUILD_opencv_xfeatures2d=OFF -DBUILD_opencv_ximgproc=OFF -DBUILD_opencv_xobjdetect=OFF -DBUILD_opencv_xphoto=OFF -DBUILD_EXAMPLES=OFF -DBUILD_PACKAGE=OFF -DBUILD_TESTS=OFF -DBUILD_PERF_TESTS=OFF -DBUILD_DOCS=OFF -DWITH_PTHREADS_PF=ON -DCV_ENABLE_INTRINSICS=ON -DOPENCV_EXTRA_MODULES_PATH=../../V4D/modules/ .. +make -j8 +sudo make install +``` + +## OR: Full V4D build with examples, demos and debian packages (takes a while) + +```bash +git clone --branch GCV https://github.com/kallaballa/opencv.git +git clone https://github.com/kallaballa/V4D.git +mkdir opencv/build +cd opencv/build +cmake -DINSTALL_BIN_EXAMPLES=ON -DOPENCV_CUSTOM_PACKAGE_INFO=ON -DCPACK_PACKAGE_VERSION_MAJOR=4 -DCPACK_PACKAGE_VERSION_MINOR=8 -DCPACK_PACKAGE_VERSION_PATCH=0 -DCPACK_PACKAGE_VERSION=4:8.0-kallaballa -DCMAKE_BUILD_TYPE=Release -DCPACK_PACKAGE_CONTACT="amir@viel-zu.org" -DOPENCV_GENERATE_PKGCONFIG=ON -DCPACK_PACKAGE_VENDOR=kallaballa -DCPACK_DEBIAN_PACKAGE_DEPENDS="libqt5opengl5,freeglut3,ocl-icd-libopencl1,libavcodec58,libavdevice58,libavfilter7,libavformat58,libavutil56,libpostproc55,libswresample3,libswscale5,libglfw3,libstb0,libglew2.2,zlib1g,libxinerama1,libxcursor1,libxi6,libva2,intel-opencl-icd,ca-certificates" -DINSTALL_CREATE_DISTRIB=ON -DCPACK_BINARY_DEB=ON -DCV_TRACE=OFF -DBUILD_SHARED_LIBS=ON -DWITH_OPENGL=ON -DOPENCV_ENABLE_EGL=ON -DOPENCV_ENABLE_GLX=ON -DOPENCV_FFMPEG_ENABLE_LIBAVDEVICE=ON -DWITH_QT=ON -DWITH_FFMPEG=ON -DOPENCV_FFMPEG_SKIP_BUILD_CHECK=ON -DWITH_VA=ON -DWITH_VA_INTEL=ON -DWITH_1394=OFF -DWITH_ADE=OFF -DWITH_VTK=OFF -DWITH_EIGEN=OFF -DWITH_GTK=OFF -DWITH_GTK_2_X=OFF -DWITH_IPP=OFF -DWITH_JASPER=OFF -DWITH_WEBP=OFF -DWITH_OPENEXR=OFF -DWITH_OPENVX=OFF -DWITH_OPENNI=OFF -DWITH_OPENNI2=OFF-DWITH_TBB=OFF -DWITH_TIFF=OFF -DWITH_OPENCL=ON -DWITH_OPENCL_SVM=ON -DWITH_OPENCLAMDFFT=OFF -DWITH_OPENCLAMDBLAS=OFF -DWITH_GPHOTO2=OFF -DWITH_LAPACK=OFF -DWITH_ITT=OFF -DWITH_QUIRC=ON -DBUILD_ZLIB=OFF -DBUILD_opencv_apps=OFF -DBUILD_opencv_calib3d=ON -DBUIlD_opencv_ccalib=OFF -DBUILD_opencv_dnn=ON -DBUILD_opencv_features2d=ON -DBUILD_opencv_flann=ON -DBUILD_opencv_gapi=OFF -DBUILD_opencv_ml=OFF -DBUILD_opencv_photo=ON -DBUILD_opencv_imgcodecs=ON -DBUILD_opencv_shape=OFF -DBUILD_opencv_videoio=ON -DBUILD_opencv_videostab=OFF -DBUILD_opencv_highgui=ON -DBUILD_opencv_superres=OFF -DBUILD_opencv_stitching=ON -DBUILD_opencv_java=OFF -DBUILD_opencv_js=OFF -DBUILD_opencv_python2=OFF -DBUILD_opencv_python3=OFF -DBUILD_opencv_alphamat=OFF -DBUILD_opencv_aruco=OFF -DBUILD_opencv_barcode=OFF -DBUILD_opencv_bgsegm=OFF -DBUILD_opencv_bioinspired=OFF -DBUILD_opencv_ccalib=ON -DBUILD_opencv_cnn_3dobj=OFF -DBUILD_opencv_cudaarithm=OFF -DBUILD_opencv_cudabgsegm=OFF -DBUILD_opencv_cudacodec=OFF -DBUILD_opencv_cudafeatures2d=OFF -DBUILD_opencv_cudafilters=OFF -DBUILD_opencv_cudaimgproc=OFF -DBUILD_opencv_cudalegacy=OFF -DBUILD_opencv_cudaobjdetect=OFF -DBUILD_opencv_cudaoptflow=OFF -DBUILD_opencv_cudastereo=OFF -DBUILD_opencv_cudawarping=OFF -DBUILD_opencv_cudev=OFF -DBUILD_opencv_cvv=OFF -DBUILD_opencv_datasets=OFF -DBUILD_opencv_dnn_objdetect=OFF -DBUILD_opencv_dnns_easily_fooled=OFF -DBUILD_opencv_dnn_superres=OFF -DBUILD_opencv_dpm=OFF -DBUILD_opencv_face=ON -DBUILD_opencv_freetype=OFF 
-DBUILD_opencv_fuzzy=OFF -DBUILD_opencv_hdf=OFF -DBUILD_opencv_hfs=OFF -DBUILD_opencv_img_hash=OFF -DBUILD_opencv_intensity_transform=OFF -DBUILD_opencv_julia=OFF -DBUILD_opencv_line_descriptor=OFF -DBUILD_opencv_matlab=OFF -DBUILD_opencv_mcc=OFF -DBUILD_opencv_optflow=ON -DBUILD_opencv_ovis=OFF -DBUILD_opencv_phase_unwrapping=OFF -DBUILD_opencv_plot=ON -DBUILD_opencv_quality=OFF -DBUILD_opencv_rapid=OFF -DBUILD_opencv_README.md=OFF -DBUILD_opencv_reg=OFF -DBUILD_opencv_rgbd=OFF -DBUILD_opencv_saliency=OFF -DBUILD_opencv_sfm=OFF -DBUILD_opencv_shape=OFF -DBUILD_opencv_stereo=OFF -DBUILD_opencv_structured_light=OFF -DBUILD_opencv_superres=OFF -DBUILD_opencv_surface_matching=OFF -DBUILD_opencv_text=OFF -DBUILD_opencv_tracking=ON -DBUILD_opencv_videostab=OFF -DBUILD_opencv_viz=OFF -DBUILD_opencv_wechat_qrcode=OFF -DBUILD_opencv_xfeatures2d=OFF -DBUILD_opencv_ximgproc=ON -DBUILD_opencv_xobjdetect=OFF -DBUILD_opencv_xphoto=OFF -DBUILD_EXAMPLES=ON -DBUILD_PACKAGE=ON -DBUILD_TESTS=OFF -DBUILD_PERF_TESTS=OFF -DBUILD_DOCS=ON -DWITH_PTHREADS_PF=ON -DCV_ENABLE_INTRINSICS=ON -DOPENCV_EXTRA_MODULES_PATH=../../V4D/modules/ .. +make -j8 +sudo make install +``` +## Build debian packages +```bash +cpack DEB +``` + +## Download the example videos +```bash +# big buck bunny video +wget -O bunny.webm https://upload.wikimedia.org/wikipedia/commons/transcoded/f/f3/Big_Buck_Bunny_first_23_seconds_1080p.ogv/Big_Buck_Bunny_first_23_seconds_1080p.ogv.1080p.vp9.webm +# dance video +yt-dlp -o dance.webm "https://www.youtube.com/watch?v=yg6LZtNeO_8" +# kristen video +yt-dlp -o kristen.webm "https://www.youtube.com/watch?v=hUAT8Jm_dvw&t=11s" +``` + +## Run the examples and demos +``` +# Examples +bin/example_v4d_display_image +bin/example_v4d_display_image_fb +bin/example_v4d_vector_graphics +bin/example_v4d_vector_graphics_and_fb +bin/example_v4d_render_opengl +bin/example_v4d_font_rendering +bin/example_v4d_video_editing +bin/example_v4d_custom_source_and_sink +bin/example_v4d_font_with_gui + +# Demos +bin/example_v4d_cube-demo +bin/example_v4d_many_cubes-demo +bin/example_v4d_video-demo bunny.webm +bin/example_v4d_nanovg-demo bunny.webm +bin/example_v4d_shader-demo bunny.webm +bin/example_v4d_font-demo +bin/example_v4d_pedestrian-demo dance.webm +bin/example_v4d_optflow-demo dance.webm +bin/example_v4d_beauty-demo kristen.webm + +``` + +# Attribution +* The author of the bunny video is the **Blender Foundation** ([Original video](https://www.bigbuckbunny.org)). +* The author of the dance video is **GNI Dance Company** ([Original video](https://www.youtube.com/watch?v=yg6LZtNeO_8)). +* The author of the video used in the beauty-demo video is **Kristen Leanne** ([Original video](https://www.youtube.com/watch?v=hUAT8Jm_dvw&t=11s)). 
+* The author of cxxpool is **Christian Blume** (Copyright (c) 2022) ([LICENSE](https://github.com/bloomen/cxxpool/blob/master/LICENSE)) +* The author of the Roboto font family is **Google Inc.** ([LICENSE](https://github.com/googlefonts/roboto/blob/main/LICENSE)) diff --git a/modules/v4d/tutorials/01-dislay_image.markdown b/modules/v4d/tutorials/01-dislay_image.markdown new file mode 100644 index 000000000..7b75df1a6 --- /dev/null +++ b/modules/v4d/tutorials/01-dislay_image.markdown @@ -0,0 +1,16 @@ +# Display an image using the video pipeline {#v4d_display_image_pipeline} + +@prev_tutorial{v4d} +@next_tutorial{v4d_display_image_fb} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Using the video pipeline +There are several ways to display an image using V4D. The most convenient one is to feed the image to V4D through the video pipeline. This has the advantage that the image is automatically resized to the framebuffer size (preserving the aspect ratio) and color converted (the framebuffer is BGRA while video frames are expected to be BGR). + +\htmlinclude "../samples/example_v4d_display_image.html" + +@include samples/display_image.cpp diff --git a/modules/v4d/tutorials/02-dislay_image_fb.markdown b/modules/v4d/tutorials/02-dislay_image_fb.markdown new file mode 100644 index 000000000..b41e891b4 --- /dev/null +++ b/modules/v4d/tutorials/02-dislay_image_fb.markdown @@ -0,0 +1,16 @@ +# Display an image using direct framebuffer access {#v4d_display_image_fb} + +@prev_tutorial{v4d_display_image_pipeline} +@next_tutorial{v4d_display_image_nvg} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Using direct framebuffer access +Instead of feeding the image to the video pipeline, we can request the framebuffer in a ```fb``` context and copy the image to it. However, we first have to resize and color convert the image manually. + +\htmlinclude "../samples/example_v4d_display_image_fb.html" + +@include samples/display_image_fb.cpp diff --git a/modules/v4d/tutorials/03-vector_graphics.markdown b/modules/v4d/tutorials/03-vector_graphics.markdown new file mode 100644 index 000000000..dec9d087d --- /dev/null +++ b/modules/v4d/tutorials/03-vector_graphics.markdown @@ -0,0 +1,16 @@ +# Render vector graphics {#v4d_vector_graphics} + +@prev_tutorial{v4d_display_image_nvg} +@next_tutorial{v4d_vector_graphics_and_fb} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Vector graphics +The HTML5-canvas-like ```nvg``` context makes sophisticated vector graphics rendering possible. + +\htmlinclude "../samples/example_v4d_vector_graphics.html" + +@include samples/vector_graphics.cpp diff --git a/modules/v4d/tutorials/04-vector_graphics_and_fb.markdown b/modules/v4d/tutorials/04-vector_graphics_and_fb.markdown new file mode 100644 index 000000000..3fcf6f735 --- /dev/null +++ b/modules/v4d/tutorials/04-vector_graphics_and_fb.markdown @@ -0,0 +1,16 @@ +# Render vector graphics and manipulate the framebuffer {#v4d_vector_graphics_and_fb} + +@prev_tutorial{v4d_vector_graphics} +@next_tutorial{v4d_render_opengl} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Vector graphics and framebuffer manipulation +The framebuffer can be accessed directly to manipulate data created in other contexts.
In this case, vector graphics are rendered to the framebuffer through NanoVG and then blurred using an ```fb``` context. + +\htmlinclude "../samples/example_v4d_vector_graphics_and_fb.html" + +@include samples/vector_graphics_and_fb.cpp diff --git a/modules/v4d/tutorials/05-render_opengl.markdown b/modules/v4d/tutorials/05-render_opengl.markdown new file mode 100644 index 000000000..bd6bc7837 --- /dev/null +++ b/modules/v4d/tutorials/05-render_opengl.markdown @@ -0,0 +1,18 @@ +# OpenGL Rendering {#v4d_render_opengl} + +@prev_tutorial{v4d_vector_graphics_and_fb} +@next_tutorial{v4d_font_rendering} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Render a blue screen using OpenGL +This example simply paints the screen blue using OpenGL (without shaders, for brevity). One important detail of this example is that state is preserved between invocations of a context type (in this case the ```gl``` context). + +\htmlinclude "../samples/example_v4d_render_opengl.html" + +@include samples/render_opengl.cpp + + diff --git a/modules/v4d/tutorials/06-font_rendering.markdown b/modules/v4d/tutorials/06-font_rendering.markdown new file mode 100644 index 000000000..6613a3a45 --- /dev/null +++ b/modules/v4d/tutorials/06-font_rendering.markdown @@ -0,0 +1,17 @@ +# Font rendering {#v4d_font_rendering} + +@prev_tutorial{v4d_render_opengl} +@next_tutorial{v4d_video_editing} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Render Hello World +Draws "Hello World" to the screen using NanoVG. Demonstrates some basic font options. + +\htmlinclude "../samples/example_v4d_font_rendering.html" + +@include samples/font_rendering.cpp + diff --git a/modules/v4d/tutorials/07-video_editing.markdown b/modules/v4d/tutorials/07-video_editing.markdown new file mode 100644 index 000000000..c6a74eb69 --- /dev/null +++ b/modules/v4d/tutorials/07-video_editing.markdown @@ -0,0 +1,18 @@ +# Video editing {#v4d_video_editing} + +@prev_tutorial{v4d_font_rendering} +@next_tutorial{v4d_custom_source_and_sink} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Render text on top of a video +By adding a source and a sink, V4D becomes capable of video editing. This example reads a video, renders text on top and writes the result. Note: reading and writing of video data is multi-threaded in the background for performance reasons. + +\htmlinclude "../samples/example_v4d_video_editing.html" + +@include samples/video_editing.cpp + + diff --git a/modules/v4d/tutorials/08-custom_source_and_sink.markdown b/modules/v4d/tutorials/08-custom_source_and_sink.markdown new file mode 100644 index 000000000..3811a7dcd --- /dev/null +++ b/modules/v4d/tutorials/08-custom_source_and_sink.markdown @@ -0,0 +1,19 @@ +# Custom Source and Sink {#v4d_custom_source_and_sink} + +@prev_tutorial{v4d_video_editing} +@next_tutorial{v4d_font_with_gui} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Reading and writing to V4D using custom sources and sinks +In the previous tutorial we used a default video source and a video sink to stream a video through V4D, where it can be manipulated using OpenGL, NanoVG or OpenCV. In this example we create a custom source that generates rainbow frames: each time the source is invoked, the frame is colored slightly differently.
Additionally, the custom sink saves individual images instead of a video (only in native builds). + +\htmlinclude "../samples/example_v4d_custom_source_and_sink.html" + +@include samples/custom_source_and_sink.cpp + + + diff --git a/modules/v4d/tutorials/09-font_with_gui.markdown b/modules/v4d/tutorials/09-font_with_gui.markdown new file mode 100644 index 000000000..f631bad4c --- /dev/null +++ b/modules/v4d/tutorials/09-font_with_gui.markdown @@ -0,0 +1,17 @@ +# Form-based GUI {#v4d_font_with_gui} + +@prev_tutorial{v4d_custom_source_and_sink} +@next_tutorial{v4d} +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Font rendering with a form-based GUI +Draws "Hello World" to the screen and lets you control the font size and color with a GUI. + +\htmlinclude "../samples/example_v4d_font_with_gui.html" + +@include samples/font_with_gui.cpp + + diff --git a/modules/v4d/tutorials/10-cube.markdown b/modules/v4d/tutorials/10-cube.markdown new file mode 100644 index 000000000..9b87fe9bf --- /dev/null +++ b/modules/v4d/tutorials/10-cube.markdown @@ -0,0 +1,16 @@ +# Cube-Demo {#v4d_cube} + +@prev_tutorial{v4d} +@next_tutorial{v4d_many_cubes} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Renders a rainbow cube on a blueish background using OpenGL and applies a glow effect using OpenCV. + +\htmlinclude "../samples/example_v4d_cube-demo.html" + +@include samples/cube-demo.cpp + diff --git a/modules/v4d/tutorials/11-video.markdown b/modules/v4d/tutorials/11-video.markdown new file mode 100644 index 000000000..f663de226 --- /dev/null +++ b/modules/v4d/tutorials/11-video.markdown @@ -0,0 +1,17 @@ +# Video-Demo {#v4d_video} + +@prev_tutorial{v4d_many_cubes} +@next_tutorial{v4d_nanovg} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Renders a rainbow cube on top of an input video using OpenGL and applies a glow effect using OpenCV. + +\htmlinclude "../samples/example_v4d_video-demo.html" + +@include samples/video-demo.cpp + + diff --git a/modules/v4d/tutorials/12-nanovg.markdown b/modules/v4d/tutorials/12-nanovg.markdown new file mode 100644 index 000000000..95a8a7a12 --- /dev/null +++ b/modules/v4d/tutorials/12-nanovg.markdown @@ -0,0 +1,15 @@ +# Nanovg-Demo {#v4d_nanovg} + +@prev_tutorial{v4d_video} +@next_tutorial{v4d_shader} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Renders a color wheel on top of an input video using NanoVG (OpenGL) and does colorspace conversions using OpenCV. + +\htmlinclude "../samples/example_v4d_nanovg-demo.html" + +@include samples/nanovg-demo.cpp diff --git a/modules/v4d/tutorials/13-shader.markdown b/modules/v4d/tutorials/13-shader.markdown new file mode 100644 index 000000000..6df8fa7fc --- /dev/null +++ b/modules/v4d/tutorials/13-shader.markdown @@ -0,0 +1,17 @@ +# Shader-Demo {#v4d_shader} + +@prev_tutorial{v4d_nanovg} +@next_tutorial{v4d_font} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Renders a Mandelbrot fractal zoom. Uses shaders, OpenCV and video editing together.
+ +\htmlinclude "../samples/example_v4d_shader-demo.html" + +@include samples/shader-demo.cpp + + diff --git a/modules/v4d/tutorials/14-font.markdown b/modules/v4d/tutorials/14-font.markdown new file mode 100644 index 000000000..f3ffc8b91 --- /dev/null +++ b/modules/v4d/tutorials/14-font.markdown @@ -0,0 +1,16 @@ +# Font-Demo {#v4d_font} + +@prev_tutorial{v4d_shader} +@next_tutorial{v4d_pedestrian} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Renders a Star Wars-like text crawl using NanoVG (OpenGL) and uses OpenCV for a pseudo-3D effect. + +\htmlinclude "../samples/example_v4d_font-demo.html" + +@include samples/font-demo.cpp + diff --git a/modules/v4d/tutorials/15-pedestrian.markdown b/modules/v4d/tutorials/15-pedestrian.markdown new file mode 100644 index 000000000..50fe362dd --- /dev/null +++ b/modules/v4d/tutorials/15-pedestrian.markdown @@ -0,0 +1,17 @@ +# Pedestrian-Demo {#v4d_pedestrian} + +@prev_tutorial{v4d_font} +@next_tutorial{v4d_optflow} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Pedestrian detection using HOG with a linear SVM, non-maximal suppression (NMS) and tracking using KCF. Uses NanoVG for rendering (OpenGL), detects using a linear SVM (OpenCV), filters results using NMS and tracks using KCF. + +\htmlinclude "../samples/example_v4d_pedestrian-demo.html" + +@include samples/pedestrian-demo.cpp + + diff --git a/modules/v4d/tutorials/16-optflow.markdown b/modules/v4d/tutorials/16-optflow.markdown new file mode 100644 index 000000000..d2764f075 --- /dev/null +++ b/modules/v4d/tutorials/16-optflow.markdown @@ -0,0 +1,18 @@ +# Optflow-Demo {#v4d_optflow} + +@prev_tutorial{v4d_pedestrian} +@next_tutorial{v4d_beauty} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Optical flow visualization on top of a video. Uses background subtraction (OpenCV) to isolate areas with motion, detects features to track (OpenCV), calculates the optical flow (OpenCV), uses NanoVG for rendering (OpenGL) and post-processes the video (OpenCV). + +\htmlinclude "../samples/example_v4d_optflow-demo.html" + +@include samples/optflow-demo.cpp + + + diff --git a/modules/v4d/tutorials/17-beauty.markdown b/modules/v4d/tutorials/17-beauty.markdown new file mode 100644 index 000000000..08194500c --- /dev/null +++ b/modules/v4d/tutorials/17-beauty.markdown @@ -0,0 +1,14 @@ +# Beauty-Demo {#v4d_beauty} + +@prev_tutorial{v4d_optflow} +@next_tutorial{v4d} +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Face beautification using face landmark detection (OpenCV), NanoVG (OpenGL) for drawing masks and multi-band blending to put it all together. + +\htmlinclude "../samples/example_v4d_beauty-demo.html" + +@include samples/beauty-demo.cpp diff --git a/modules/v4d/tutorials/18-many-cubes.markdown b/modules/v4d/tutorials/18-many-cubes.markdown new file mode 100644 index 000000000..699dbef5b --- /dev/null +++ b/modules/v4d/tutorials/18-many-cubes.markdown @@ -0,0 +1,16 @@ +# Many_Cubes-Demo {#v4d_many_cubes} + +@prev_tutorial{v4d_cube} +@next_tutorial{v4d_video} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +Renders 10 rainbow cubes on a blueish background using OpenGL and applies a glow effect using OpenCV.
The special thing about this demo is that each cube is rendered in a different OpenGL context with its own independent OpenGL state. Granted, this certainly isn't the most efficient way to draw multiple copies, but it serves well to demonstrate how independent OpenGL contexts/states can be used. + +\htmlinclude "../samples/example_v4d_many_cubes-demo.html" + +@include samples/many_cubes-demo.cpp + diff --git a/modules/v4d/tutorials/19-dislay_image_nvg.markdown b/modules/v4d/tutorials/19-dislay_image_nvg.markdown new file mode 100644 index 000000000..88272ec4f --- /dev/null +++ b/modules/v4d/tutorials/19-dislay_image_nvg.markdown @@ -0,0 +1,16 @@ +# Display an image using NanoVG {#v4d_display_image_nvg} + +@prev_tutorial{v4d_display_image_fb} +@next_tutorial{v4d_vector_graphics} + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +## Using NanoVG to display images +Instead of feeding the image to the video pipeline or accessing the framebuffer directly, we can use NanoVG to display an image. It is not as convenient as the other methods, but it is very fast and flexible. + +\htmlinclude "../samples/example_v4d_display_image_nvg.html" + +@include samples/display_image_nvg.cpp diff --git a/modules/v4d/tutorials/20-wasm.markdown b/modules/v4d/tutorials/20-wasm.markdown new file mode 100644 index 000000000..1d8e053ac --- /dev/null +++ b/modules/v4d/tutorials/20-wasm.markdown @@ -0,0 +1,79 @@ +# WebAssembly Support {#v4d_webassembly_support} + +[TOC] + +| | | +| -: | :- | +| Original author | Amir Hassan (kallaballa) | +| Compatibility | OpenCV >= 4.7 | + +# What is WebAssembly? +It is possible to compile C++ (but also other languages) for the browser. The resulting binaries usually contain WebAssembly (WASM), which the browser knows how to execute. + +# So what makes it special for OpenCV and V4D? +For OpenCV it has been possible to run code in the browser for a while using [OpenCV.js](https://docs.opencv.org/4.x/d0/d84/tutorial_js_usage.html). But OpenCV.js merely offers the OpenCV APIs; visualization and GUI have to be done by other means (e.g. the HTML5 Canvas). That is where V4D steps in: it has been written with WebAssembly in mind and cleverly uses [OpenGL](https://en.wikipedia.org/wiki/OpenGL) in a fashion that translates well to [WebGL](https://en.wikipedia.org/wiki/WebGL). V4D enables you to write graphical OpenCV applications that run natively as well as in the browser. + +# Dependencies +* [Emscripten](https://emscripten.org) +* [My OpenCV 4.x Fork](https://github.com/kallaballa/opencv/tree/GCV) +* [V4D](https://github.com/kallaballa/V4D) + +# Instructions for Ubuntu 22.04.2 LTS + +## Install required packages +``` +# Install basic packages +apt install cmake make git-core build-essential pkg-config python3 software-properties-common +``` + +## Optional: Install Firefox +In case you don't have a recent browser, here are the instructions to get Firefox. + +``` +# Add Mozilla PPA +add-apt-repository ppa:mozillateam/ppa + +# Install Firefox +apt install firefox-esr +``` + +## Install emscripten +``` +# Get the emsdk repo +git clone https://github.com/emscripten-core/emsdk.git + +# Enter that directory +cd emsdk + +# Download and install the latest SDK tools. +./emsdk install latest + +# Make the "latest" SDK "active" for the current user. (writes .emscripten file) +./emsdk activate latest + +# Activate PATH and other environment variables in the current terminal +source ./emsdk_env.sh + +# Leave the directory +cd ..
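+
+# Optional sanity check (not part of the original instructions): if the SDK was
+# activated and the environment sourced correctly, the Emscripten compiler
+# should now be on the PATH and print its version.
+emcc --version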
+``` + +## Build V4D with my OpenCV fork and all examples and demos for the browser +``` +git clone --branch GCV https://github.com/kallaballa/opencv.git +git clone https://github.com/kallaballa/V4D.git +mkdir opencv/build +cd opencv/build +emcmake cmake -DOPENCV_FORCE_3RDPARTY_BUILD=OFF -DPYTHON_DEFAULT_EXECUTABLE=/usr/bin/python3 -DENABLE_PIC=FALSE -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_TOOLCHAIN_FILE='../../emsdk/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake' -DCPU_BASELINE='' -DCMAKE_INSTALL_PREFIX=/usr/local -DCPU_DISPATCH='' -DCV_TRACE=OFF -DBUILD_SHARED_LIBS=OFF -DWITH_OPENGL=ON -DWITH_1394=OFF -DWITH_ADE=OFF -DWITH_VTK=OFF -DWITH_EIGEN=OFF -DWITH_FFMPEG=OFF -DWITH_GSTREAMER=OFF -DWITH_GTK=OFF -DWITH_GTK_2_X=OFF -DWITH_IPP=OFF -DWITH_JASPER=OFF -DWITH_JPEG=OFF -DWITH_WEBP=OFF -DWITH_OPENEXR=OFF -DWITH_OPENVX=OFF -DWITH_OPENNI=OFF -DWITH_OPENNI2=OFF -DWITH_PNG=OFF -DWITH_TBB=OFF -DWITH_TIFF=OFF -DWITH_V4L=OFF -DWITH_OPENCL=OFF -DWITH_OPENCL_SVM=OFF -DWITH_OPENCLAMDFFT=OFF -DWITH_OPENCLAMDBLAS=OFF -DWITH_GPHOTO2=OFF -DWITH_LAPACK=OFF -DWITH_ITT=OFF -DWITH_QUIRC=ON -DBUILD_ZLIB=OFF -DBUILD_opencv_apps=OFF -DBUILD_opencv_calib3d=ON -DBUILD_opencv_dnn=ON -DBUILD_opencv_features2d=ON -DBUILD_opencv_flann=ON -DBUILD_opencv_gapi=OFF -DBUILD_opencv_ml=OFF -DBUILD_opencv_photo=ON -DBUILD_opencv_imgcodecs=ON -DBUILD_opencv_shape=OFF -DBUILD_opencv_videoio=ON -DBUILD_opencv_videostab=OFF -DBUILD_opencv_highgui=OFF -DBUILD_opencv_superres=OFF -DBUILD_opencv_stitching=ON -DBUILD_opencv_java=OFF -DBUILD_opencv_js=ON -DBUILD_opencv_python2=OFF -DBUILD_opencv_python3=OFF -DBUILD_opencv_alphamat=OFF -DBUILD_opencv_aruco=OFF -DBUILD_opencv_barcode=OFF -DBUILD_opencv_bgsegm=OFF -DBUILD_opencv_bioinspired=OFF -DBUILD_opencv_ccalib=ON -DBUILD_opencv_cnn_3dobj=OFF -DBUILD_opencv_cudaarithm=OFF -DBUILD_opencv_cudabgsegm=OFF -DBUILD_opencv_cudacodec=OFF -DBUILD_opencv_cudafeatures2d=OFF -DBUILD_opencv_cudafilters=OFF -DBUILD_opencv_cudaimgproc=OFF -DBUILD_opencv_cudalegacy=OFF -DBUILD_opencv_cudaobjdetect=OFF -DBUILD_opencv_cudaoptflow=OFF -DBUILD_opencv_cudastereo=OFF -DBUILD_opencv_cudawarping=OFF -DBUILD_opencv_cudev=OFF -DBUILD_opencv_cvv=OFF -DBUILD_opencv_datasets=OFF -DBUILD_opencv_dnn_objdetect=OFF -DBUILD_opencv_dnns_easily_fooled=OFF -DBUILD_opencv_dnn_superres=OFF -DBUILD_opencv_dpm=OFF -DBUILD_opencv_face=ON -DBUILD_opencv_freetype=OFF -DBUILD_opencv_fuzzy=OFF -DBUILD_opencv_hdf=OFF -DBUILD_opencv_hfs=OFF -DBUILD_opencv_img_hash=OFF -DBUILD_opencv_intensity_transform=OFF -DBUILD_opencv_julia=OFF -DBUILD_opencv_line_descriptor=OFF -DBUILD_opencv_matlab=OFF -DBUILD_opencv_mcc=OFF -DBUILD_opencv_optflow=ON -DBUILD_opencv_ovis=OFF -DBUILD_opencv_phase_unwrapping=OFF -DBUILD_opencv_plot=ON -DBUILD_opencv_quality=OFF -DBUILD_opencv_rapid=OFF -DBUILD_opencv_README.md=OFF -DBUILD_opencv_reg=OFF -DBUILD_opencv_rgbd=OFF -DBUILD_opencv_saliency=OFF -DBUILD_opencv_sfm=OFF -DBUILD_opencv_shape=OFF -DBUILD_opencv_stereo=OFF -DBUILD_opencv_structured_light=OFF -DBUILD_opencv_superres=OFF -DBUILD_opencv_surface_matching=OFF -DBUILD_opencv_text=OFF -DBUILD_opencv_tracking=ON -DBUILD_opencv_videostab=OFF -DBUILD_opencv_viz=OFF -DBUILD_opencv_wechat_qrcode=OFF -DBUILD_opencv_xfeatures2d=OFF -DBUILD_opencv_ximgproc=ON -DBUILD_opencv_xobjdetect=OFF -DBUILD_opencv_xphoto=OFF -DBUILD_EXAMPLES=ON -DBUILD_PACKAGE=OFF -DBUILD_TESTS=OFF -DBUILD_PERF_TESTS=OFF -DOPENCV_EXTRA_MODULES_PATH=../../V4D/modules/ -DBUILD_DOCS=OFF -DWITH_PTHREADS_PF=ON -DCV_ENABLE_INTRINSICS=ON -DBUILD_WASM_INTRIN_TESTS=OFF 
-DCMAKE_C_FLAGS="-s USE_PTHREADS=1 -s USE_ZLIB=1 -msimd128" -DCMAKE_CXX_FLAGS="-s USE_PTHREADS=1 -s PTHREAD_POOL_SIZE_STRICT=0 -s PTHREAD_POOL_SIZE=8 -s USE_ZLIB=1 -msimd128" -DCMAKE_LD_FLAGS="-s EXPORTED_RUNTIME_METHODS=['ccall','cwrap'] --bind -s MALLOC=emmalloc -s WASM_BIGINT=1 -s USE_GLFW=3 -s WASM=1 -s SINGLE_FILE=1 -s USE_PTHREADS=1 -s PTHREAD_POOL_SIZE_STRICT=0 -s PTHREAD_POOL_SIZE=8 -s USE_ZLIB=1 -msimd128" .. +make -j8 +sudo make -j8 install +``` + +## Run the examples and demos +Although the examples and demos are compiled and come with an HTML file to run them, you can't simply open that file in a browser. Certain WebAssembly features require [special HTTP headers](https://emscripten.org/docs/porting/pthreads.html?highlight=pthreads). Fortunately, Emscripten provides a tool ([emrun](https://emscripten.org/docs/compiling/Running-html-files-with-emrun.html)) that starts a web server configured in just the right way. + +So, to run the cube-demo you have to do the following: + +``` +emrun --browser=firefox-esr bin/example_v4d_cube-demo.html +```
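+
+If you want to use a browser that emrun doesn't launch for you, or open the page from another machine, emrun can also act as a plain web server (see the emrun documentation linked above; the port number below is just an example):
+
+```
+# Serve the build output without launching a browser, then open
+# http://localhost:8080/example_v4d_cube-demo.html in the browser of your choice.
+emrun --no_browser --port 8080 bin/example_v4d_cube-demo.html
+```
\ No newline at end of file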