{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 通过 UMA 使您的硬件加速器 TVM-ready\n", "\n", "**Authors**: [Michael J. Klaiber](https://github.com/MichaelJKlaiber), [Christoph Gerum](https://github.com/cgerum),\n", "[Paul Palomero Bernardo](https://github.com/PaulPalomeroBernardo/)\n", "\n", "这是 **通用模块化加速器接口** (Universal Modular Accelerator Interface,简称 UMA)的入门教程。UMA 提供简单易用的 API,将新的硬件加速器集成到 TVM 中。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "本教程逐步指导您如何使用 UMA,使您的硬件加速器 TVM-ready。\n", "\n", "虽然这个问题没有一刀切的解决方案,但 UMA 的目标是提供一个稳定的、仅使用 Python 的 API,将许多硬件加速器类集成到 TVM 中。\n", "\n", "在本教程中,您将了解 UMA API 在三个日益复杂的用例中。在这些用例中,三个模拟加速器 **Vanilla**、**Strawberry** 和 **Chocolate** 被引入并使用 UMA 集成到 TVM 中。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Vanilla\n", "\n", "**Vanilla** 是由 MAC array 组成,没有内部内存的简单加速器。它只能处理 Conv2D 层,所有其他层都在 CPU 上执行,这也协调了 **Vanilla**。CPU 和 Vanilla 都使用共享内存。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Vanilla** 提供 C 接口 ``vanilla_conv2dnchw(...)`` 用于携带 Conv2D 运算(包括 same-padding),它接受指向输入 feature map、权重和结果的指针,以及的 `Conv2D` 的维度:`oc`, `iw`, `ih`, `ic`, `kh`, `kw`。\n", "\n", "```c++\n", "int vanilla_conv2dnchw(float* ifmap, float* weights, float* result, int oc, int iw, int ih, int ic, int kh, int kw);\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "脚本 `uma_cli` 为新的加速器创建带有 API-calls 的代码骨架。\n", "\n", "对于 **Vanilla**,我们使用如下方法: (``--tutorial vanilla`` 添加教程这部分所需的所有附加文件)\n", "\n", "```bash\n", "pip install inflection\n", "cd $TVM_HOME/apps/uma\n", "python uma_cli.py --add_hardware vanilla_accelerator --tutorial vanilla\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`uma_cli.py` 将在 ``vanilla_accelerator`` 目录中生成这些文件,我们将重新访问这个目录。\n", "\n", "```bash\n", "backend.py\n", "codegen.py\n", "conv2dnchw.cc\n", "passes.py\n", "patterns.py\n", "run.py\n", "strategies.py\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vanilla 后端\n", "\n", "为 vanilla 生成的后端可以在 `vanilla_accelerator/backend.py` 中找到:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "class VanillaAcceleratorBackend(UMABackend):\n", " \"\"\"UMA backend for VanillaAccelerator.\"\"\"\n", "\n", " def __init__(self):\n", " super().__init__()\n", "\n", " self._register_pattern(\"conv2d\", conv2d_pattern())\n", " self._register_tir_pass(PassPhase.TIR_PHASE_0, VanillaAcceleratorConv2DPass())\n", " self._register_codegen(fmt=\"c\", includes=gen_includes)\n", "\n", " @property\n", " def target_name(self):\n", " return \"vanilla_accelerator\"\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 定义卸载模式\n", "\n", "为了指定将 `Conv2D` 卸载(offloaded)到 **Vanilla**,它在 `vanilla_accelerator/patterns.py` 中被描述为 Relay 数据流模式([DFPattern](https://tvm.apache.org/docs/reference/langref/relay_pattern.html))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "def conv2d_pattern():\n", " pattern = is_op(\"nn.conv2d\")(wildcard(), wildcard())\n", " pattern = pattern.has_attr({\"strides\": [1, 1]})\n", " return pattern\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "为了将 `Conv2D` 运算从 input graph 映射到 **Vanilla** 的低级函数调用 ``vanilla_conv2dnchw(...)``,TIR pass `VanillaAcceleratorConv2DPass` (将在本教程后面讨论)在 `VanillaAcceleratorBackend` 中注册。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Codegen" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "文件 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Codegen" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The file ``vanilla_accelerator/codegen.py`` defines static C code that is added, via ``gen_includes``, to the C code generated by TVM's C codegen.\n", "\n", "Here, C code is added to include **Vanilla**'s low-level library ``vanilla_conv2dnchw()``.\n", "\n", "```python\n", "def gen_includes() -> str:\n", "    topdir = pathlib.Path(__file__).parent.absolute()\n", "\n", "    includes = \"\"\n", "    includes += f'#include \"{topdir}/conv2dnchw.cc\"'\n", "    return includes\n", "```\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "As shown above in `VanillaAcceleratorBackend`, it is registered with UMA via `self._register_codegen`:\n", "\n", "```python\n", "self._register_codegen(fmt=\"c\", includes=gen_includes)\n", "```\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Building the neural network and running it on Vanilla\n", "\n", "To demonstrate UMA's functionality, we will generate C code for a single `Conv2D` layer and run it on the Vanilla accelerator. The file ``vanilla_accelerator/run.py`` provides a demo that runs a Conv2D layer using Vanilla's C-API.\n", "\n", "\n", "```python\n", "def main():\n", "    mod, inputs, output_list, runner = create_conv2d()\n", "\n", "    uma_backend = VanillaAcceleratorBackend()\n", "    uma_backend.register()\n", "    mod = uma_backend.partition(mod)\n", "    target = tvm.target.Target(\"vanilla_accelerator\", host=tvm.target.Target(\"c\"))\n", "\n", "    export_directory = tvm.contrib.utils.tempdir(keep_for_debug=True).path\n", "    print(f\"Generated files are in {export_directory}\")\n", "    compile_and_run(\n", "        AOTModel(module=mod, inputs=inputs, outputs=output_list),\n", "        runner,\n", "        interface_api=\"c\",\n", "        use_unpacked_api=True,\n", "        target=target,\n", "        test_dir=str(export_directory),\n", "    )\n", "\n", "\n", "main()\n", "```\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "By running ``vanilla_accelerator/run.py``, the output files are generated in the model library format (MLF)." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "#### Output\n", "\n", "```bash\n", "Generated files are in /tmp/tvm-debug-mode-tempdirs/2022-07-13T13-26-22___x5u76h0p/00000\n", "```\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Let's examine the generated files:\n", "\n", "Output:\n", "\n", "```bash\n", "cd /tmp/tvm-debug-mode-tempdirs/2022-07-13T13-26-22___x5u76h0p/00000\n", "cd build/\n", "ls -1\n", "\n", "codegen\n", "lib.tar\n", "metadata.json\n", "parameters\n", "runtime\n", "src\n", "```\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "To evaluate the generated C code, go to ``codegen/host/src/default_lib2.c``:\n", "\n", "```bash\n", "cd codegen/host/src/\n", "ls -1\n", "\n", "default_lib0.c\n", "default_lib1.c\n", "default_lib2.c\n", "```\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "In `default_lib2.c` you can now see that the generated code calls into Vanilla's C-API and executes the Conv2D layer:\n", "\n", "```c++\n", "TVM_DLL int32_t tvmgen_default_vanilla_accelerator_main_0(float* placeholder, float* placeholder1, float* conv2d_nchw, uint8_t* global_workspace_1_var) {\n", "  vanilla_accelerator_conv2dnchw(placeholder, placeholder1, conv2d_nchw, 32, 14, 14, 32, 3, 3);\n", "  return 0;\n", "}\n", "```" ] },
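{ "cell_type": "markdown", "metadata": {}, "source": [ "The dimensions in this call (`oc=32`, `iw=14`, `ih=14`, `ic=32`, `kh=3`, `kw=3`) correspond to the single Conv2D layer that `create_conv2d()` builds in ``vanilla_accelerator/run.py``. As a rough illustration, a Relay module with matching shapes could be constructed as in the hypothetical sketch below, assuming a batch size of 1; the actual helper in `run.py` may differ.\n", "\n", "```python\n", "import tvm\n", "from tvm import relay\n", "\n", "\n", "def create_conv2d_sketch(ifm_shape=(1, 32, 14, 14), weight_shape=(32, 32, 3, 3)):\n", "    # Hypothetical single-Conv2D module: 32 input and output channels,\n", "    # a 14x14 feature map, 3x3 kernel, stride 1 and 'same' padding.\n", "    data = relay.var(\"data\", shape=ifm_shape, dtype=\"float32\")\n", "    weight = relay.var(\"weight\", shape=weight_shape, dtype=\"float32\")\n", "    out = relay.nn.conv2d(\n", "        data, weight, kernel_size=(3, 3), strides=(1, 1), padding=(1, 1)\n", "    )\n", "    return tvm.IRModule.from_expr(relay.Function([data, weight], out))\n", "```\n" ] },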
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Strawberry\n", "Coming soon ...\n", "\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Chocolate\n", "Coming soon ...\n", "\n", "\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Request for Community Input\n", "\n", "If this tutorial **did not** fit your accelerator, please add your requirements to the UMA thread in\n", "the TVM discuss forum: [Link](https://discuss.tvm.apache.org/t/rfc-uma-universal-modular-accelerator-interface/12039).\n", "We are eager to extend this tutorial to provide guidance on making further classes of AI hardware\n", "accelerators TVM-ready using the UMA interface." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "[UMA-RFC] [UMA: Universal Modular Accelerator Interface](https://github.com/apache/tvm-rfcs/blob/main/rfcs/0060_UMA_Unified_Modular_Accelerator_Interface.md),\n", "TVM RFC, June 2022.\n", "\n", "[DFPattern] [Pattern Matching in Relay](https://tvm.apache.org/docs/reference/langref/relay_pattern.html)\n", "\n", "\n" ] }
], "metadata": { "kernelspec": { "display_name": "Python 3.8.13 ('py38': conda)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "vscode": { "interpreter": { "hash": "28558e8daad512806f5c536a1a04c119185f99f65b79002708a12162d02a79c7" } } }, "nbformat": 4, "nbformat_minor": 0 }