{
"cells": [
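{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Visualization with TEDD\n",
"\n",
"**Original author**: [Yongfeng Gu](https://github.com/yongfeng-nv)\n",
"\n",
"This tutorial introduces TEDD (Tensor Expression Debug Display), a tool for visualizing tensor expressions.\n",
"\n",
"Tensor expressions are scheduled with primitives. Individual primitives are usually easy to understand, but they become complicated quickly once they are combined. To make this easier to reason about, an operational model of schedule primitives was introduced for tensor expressions. It captures:\n",
"\n",
"- the interactions between different schedule primitives,\n",
"- the impact of the schedule primitives on the final code generation.\n",
"\n",
"The operational model is based on a Dataflow Graph, a Schedule Tree, and an IterVar Relationship Graph. Schedule primitives perform operations on these graphs.\n",
"\n",
"TEDD renders these three graphs from a given schedule. This tutorial demonstrates how to use TEDD and how to interpret the rendered graphs."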
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import tvm\n",
"from tvm import te\n",
"from tvm import topi\n",
"from tvm.contrib import tedd\n",
"from IPython.display import display_svg"
]
},
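{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define and Schedule a Convolution with Bias and ReLU\n",
"\n",
"Build an example tensor expression for a convolution followed by bias and ReLU. First connect the conv2d, add, and relu TOPI operators. Then create a TOPI generic schedule."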
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"batch = 1\n",
"in_channel = 256\n",
"in_size = 32\n",
"num_filter = 256\n",
"kernel = 3\n",
"stride = 1\n",
"padding = \"SAME\"\n",
"dilation = 1\n",
"\n",
"A = te.placeholder((in_size, in_size, in_channel, batch), name=\"A\")\n",
"W = te.placeholder((kernel, kernel, in_channel, num_filter), name=\"W\")\n",
"B = te.placeholder((1, num_filter, 1), name=\"bias\")\n",
"\n",
"with tvm.target.Target(\"llvm\"):\n",
"    t_conv = topi.nn.conv2d_hwcn(A, W, stride, padding, dilation)\n",
"    t_bias = topi.add(t_conv, B)\n",
"    t_relu = topi.nn.relu(t_bias)\n",
"    s = topi.generic.schedule_conv2d_hwcn([t_relu])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Render Graphs with TEDD\n",
"\n",
"Render the graphs to see the computation and how it is scheduled."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/svg+xml": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"graph = tedd.viz_dataflow_graph(s, show_svg=True)\n",
"display_svg(graph)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The graph can also be saved to a local file:\n",
"\n",
"```python\n",
"tedd.viz_dataflow_graph(s, dot_file_path=\"/tmp/dfg.dot\")\n",
"```"
]
},
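{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first graph is the dataflow graph. Each node represents a stage, with its name and memory scope shown in the middle and its inputs/outputs information on the sides. Edges show the dependencies between nodes."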
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/svg+xml": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"tree = tedd.viz_schedule_tree(s, show_svg=True)\n",
"# tedd.viz_schedule_tree(s, dot_file_path=\"/tmp/scheduletree.dot\")\n",
"display_svg(tree)"
]
},
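{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above rendered the schedule tree graph. You may notice a warning that ranges are not available.\n",
"\n",
"The message also suggests calling `normalize()` to infer range information. Compare the graphs before and after `normalize()` to see its effect."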
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/svg+xml": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"s = s.normalize()\n",
"# tedd.viz_schedule_tree(s, dot_file_path=\"/tmp/scheduletree2.dot\")\n",
"tree = tedd.viz_schedule_tree(s, show_svg=True)\n",
"display_svg(tree)"
]
},
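{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now take a close look at the second schedule tree. Every block under ROOT represents a stage. The stage name is shown in the top row and the compute in the bottom row. The rows in between are IterVars, ordered from outermost at the top to innermost at the bottom.\n",
"\n",
"An IterVar row contains its index, name, type, and other optional information. Take the W.shared stage as an example. The top row gives its name, \"W.shared\", and its memory scope, \"Shared\". Its compute is `W(ax0, ax1, ax2, ax3)`. Its outermost loop IterVar is ax0.ax1.fused.ax2.fused.ax3.fused.outer, which has index 0, is of type kDataPar, is bound to threadIdx.y, and has range(min=0, ext=8). This stage and its thread binding are produced by a GPU schedule; with the `llvm` target used above, the stages you see may differ.\n",
"\n",
"You can also tell the IterVar type from the color of the index box, as shown in the legend.\n",
"\n",
"If a stage is not attached to any other stage with compute_at, it has an edge directly to the ROOT node. Otherwise, it has an edge pointing to the IterVar it is attached to; for example, W.shared is attached to rx.outer in the middle compute stage.\n",
"\n",
"```{note}\n",
":class: alert alert-info\n",
"\n",
"By definition, IterVars are internal nodes and computes are leaf nodes of a schedule tree. The edges among the IterVars and the compute within one stage are omitted, so that every stage is drawn as a block, for better readability.\n",
"```"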
]
},
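{
"cell_type": "markdown",
"metadata": {},
"source": [
"The effect of `compute_at` on the schedule tree can be tried on a much smaller schedule. The following is a minimal sketch, assuming the same TVM build as the rest of this notebook (one that still provides `te.create_schedule` and `tvm.contrib.tedd`); the tensors `A`, `B`, `C` and the size `n` are hypothetical and not part of the convolution example.\n",
"\n",
"```python\n",
"from tvm import te\n",
"from tvm.contrib import tedd\n",
"\n",
"# Hypothetical two-stage elementwise pipeline, for illustration only.\n",
"n = 1024\n",
"A = te.placeholder((n,), name=\"A\")\n",
"B = te.compute((n,), lambda i: A[i] + 1.0, name=\"B\")\n",
"C = te.compute((n,), lambda i: B[i] * 2.0, name=\"C\")\n",
"\n",
"s = te.create_schedule(C.op)\n",
"\n",
"# Split C's only axis, then attach stage B to the outer loop of C.\n",
"outer, inner = s[C].split(C.op.axis[0], factor=32)\n",
"s[B].compute_at(s[C], outer)\n",
"\n",
"# Infer ranges so the tree does not warn about missing range information.\n",
"s = s.normalize()\n",
"\n",
"# Return the DOT text instead of rendering, so it can be inspected or saved.\n",
"dot_string = tedd.viz_schedule_tree(s, output_dot_string=True)\n",
"print(dot_string)\n",
"```\n",
"\n",
"In the printed DOT text, stage `B` is expected to have an edge to the `i.outer` IterVar of `C` rather than to ROOT. Removing the `compute_at` line and rendering again should show both stages attached directly to ROOT."
]
},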
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/svg+xml": "\n\n\n\n\n"
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from graphviz import Source\n",
"# tedd.viz_itervar_relationship_graph(s, dot_file_path=\"/tmp/itervar.dot\")\n",
"dot_string = tedd.viz_itervar_relationship_graph(s, output_dot_string=True)\n",
"src = Source(dot_string)\n",
"display_svg(src)"
]
},
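{
"cell_type": "markdown",
"metadata": {},
"source": [
"The last graph is the IterVar Relationship Graph. Every subgraph represents a stage and contains IterVar nodes and transformation nodes. For example, W.shared has three split nodes and three fuse nodes. The remaining nodes are IterVar nodes, in the same format as the IterVar rows in schedule trees. Root IterVars are those not driven by any transformation node, such as ax0; leaf IterVars do not drive any transformation node and have non-negative indices, such as ax0.ax1.fused.ax2.fused.ax3.fused.outer with index 0."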
]
},
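{
"cell_type": "markdown",
"metadata": {},
"source": [
"The split and fuse transformation nodes can be read as plain index arithmetic. The following is a minimal sketch in pure Python, not TVM code; the helper names `fuse` and `split`, the extents, and the split factor are hypothetical and chosen only for illustration.\n",
"\n",
"```python\n",
"def fuse(outer, inner, inner_extent):\n",
"    # Fuse two loop indices into one flat index.\n",
"    return outer * inner_extent + inner\n",
"\n",
"\n",
"def split(fused, factor):\n",
"    # Split one loop index into (outer, inner), with inner in [0, factor).\n",
"    return fused // factor, fused % factor\n",
"\n",
"\n",
"# Hypothetical extents for four root axes ax0..ax3.\n",
"extents = (3, 3, 4, 8)\n",
"factor = 32\n",
"\n",
"leaves = set()\n",
"for ax0 in range(extents[0]):\n",
"    for ax1 in range(extents[1]):\n",
"        for ax2 in range(extents[2]):\n",
"            for ax3 in range(extents[3]):\n",
"                # Three fuse nodes: ax0 with ax1, then ax2, then ax3.\n",
"                fused = fuse(ax0, ax1, extents[1])\n",
"                fused = fuse(fused, ax2, extents[2])\n",
"                fused = fuse(fused, ax3, extents[3])\n",
"                # One split node turns the fused axis into two leaf axes.\n",
"                leaves.add(split(fused, factor))\n",
"\n",
"# Every combination of root indices maps to a distinct (outer, inner) pair.\n",
"total = extents[0] * extents[1] * extents[2] * extents[3]\n",
"assert len(leaves) == total\n",
"```\n",
"\n",
"In the graph, the root axes correspond to nodes that no transformation drives, and the final `(outer, inner)` pair corresponds to leaf IterVars."
]
},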
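{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"This tutorial demonstrated the usage of TEDD, using an example built with TOPI to show the schedule under the hood. You can also use TEDD before and after any schedule primitive to inspect its effect."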
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.10.4 ('mxnetx')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
},
"vscode": {
"interpreter": {
"hash": "aa67ff675248b5ab29dcd2f00c1422844307085c8ca7c8ce7eddecd21b9c2975"
}
}
},
"nbformat": 4,
"nbformat_minor": 0
}