Model Library Format
About Model Library Format

TVM traditionally exports generated libraries as Dynamic Shared Objects (e.g. DLLs on Windows or .so files on Linux). Inference can be performed using those libraries by loading them into an executable with libtvm_runtime.so. This process depends heavily on services provided by traditional operating systems.

For deployment to unconventional platforms (e.g. those lacking a traditional OS), TVM provides another output format, Model Library Format. Initially, the microTVM project is the primary use case for this format. Should it become useful in other use cases (in particular, should it become possible to export BYOC artifacts in Model Library Format), it could be used as a general-purpose TVM export format. Model Library Format is a tarball containing a file for each piece of the TVM compiler output.
What can be Exported?

At the time of writing, export is limited to full models built with tvm.relay.build.
Directory Layout

Model Library Format is contained within a tarball. All paths are relative to the root of the tarball:
/ - Root of the tarball
    codegen/ - Root directory for all generated device code (see codegen section)
    executor-config/ - Configuration for the executor which drives model inference
        graph/ - Root directory containing configuration for the GraphExecutor
            graph.json - GraphExecutor JSON configuration
    metadata.json - Machine-parseable metadata for this model
    parameters/ - Root directory where simplified parameters are placed
        <model_name>.params - Parameters for the model, in tvm.relay._save_params format
    src/ - Root directory for all source code consumed by TVM
        relay.txt - Relay source code for the generated model
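To make the layout concrete, the following sketch builds a synthetic tarball with the directory structure above using only the Python standard library, then lists its members. The file contents are placeholders, not real TVM compiler output, and the model name is invented.

```python
# Illustrative sketch only: a synthetic tarball mimicking the Model Library
# Format layout. Real tarballs are produced by the TVM compiler.
import io
import tarfile

MEMBERS = [
    "codegen/host/src/lib0.c",
    "executor-config/graph/graph.json",
    "metadata.json",
    "parameters/my_model.params",  # "my_model" is a made-up model name
    "src/relay.txt",
]

def make_fake_mlf() -> bytes:
    """Create an in-memory tarball with the Model Library Format layout."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in MEMBERS:
            data = b"placeholder"
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

blob = make_fake_mlf()
with tarfile.open(fileobj=io.BytesIO(blob)) as tar:
    names = tar.getnames()
print(names)
```

Inspecting a real exported tarball the same way (tarfile.open on the exported file) is a quick way to confirm which pieces the compiler emitted.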
Description of Sub-directories

codegen
All TVM-generated code is placed in this directory. At the time of writing, there is 1 file per Module in the generated Module tree, though this restriction may change in the future. Files in this directory should have filenames of the form <target>/(lib|src)/<unique_name>.<format>.

These components are described below:

<target> - Identifies the TVM target on which the code should run. Currently, only host is supported.
<unique_name> - A unique slug identifying this file. Currently lib<n>, with <n> an auto-incrementing integer.
<format> - Suffix identifying the filename format. Currently c or o.
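As a sketch, the filename convention above could be split into its components with a small regular expression. parse_codegen_path is a hypothetical helper written for this document, not a TVM API:

```python
# Hypothetical helper (not part of TVM): splits a codegen-relative path of
# the form <target>/(lib|src)/<unique_name>.<format> into its components.
import re

CODEGEN_PATH_RE = re.compile(
    r"^(?P<target>[^/]+)/(?P<kind>lib|src)/(?P<unique_name>[^/.]+)\.(?P<format>[^/.]+)$"
)

def parse_codegen_path(path: str) -> dict:
    """Return the components of a codegen path, or raise ValueError."""
    m = CODEGEN_PATH_RE.match(path)
    if m is None:
        raise ValueError(f"not a codegen path: {path}")
    return m.groupdict()

parsed = parse_codegen_path("host/src/lib0.c")
print(parsed)
```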
An example directory tree for a CPU-only model is shown below:

codegen/ - Codegen directory
    host/ - Generated code for target_host
        lib/ - Generated binary object files
            lib0.o - LLVM module (if llvm target is used)
            lib1.o - LLVM CRT Metadata Module (if llvm target is used)
        src/ - Generated C source
            lib0.c - C module (if c target is used)
            lib1.c - C CRT Metadata module (if c target is used)
executor-config

Contains machine-parseable configuration for executors which can drive model inference. Currently, only the GraphExecutor produces configuration for this directory, in graph/graph.json. This file should be read in and the resulting string supplied to the GraphExecutor() constructor for parsing.
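The read-then-parse step can be sketched with the standard library. The directory tree and graph.json below are created inside the example so it is self-contained; real files come from the exported tarball, and the resulting string would be handed to the GraphExecutor constructor rather than merely parsed:

```python
# Illustrative sketch: read the GraphExecutor configuration out of an
# extracted Model Library Format tree. The stand-in graph.json here is
# minimal and made up; real files are produced by tvm.relay.build.
import json
import pathlib
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
cfg_dir = root / "executor-config" / "graph"
cfg_dir.mkdir(parents=True)
(cfg_dir / "graph.json").write_text(json.dumps({"nodes": [], "arg_nodes": []}))

# The raw JSON string is what would be supplied to GraphExecutor().
graph_json_str = (cfg_dir / "graph.json").read_text()
graph = json.loads(graph_json_str)  # sanity-check that it parses
print(sorted(graph))
```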
parameters

Contains machine-parseable parameters. A variety of formats may be provided, but at present, only the format produced by tvm.relay._save_params is supplied. When building with tvm.relay.build, the name parameter is considered to be the model name. A single file is created in this directory: <model_name>.params.
src

Contains source code parsed by TVM. Currently, just the Relay source code is created in src/relay.txt.
Metadata

Machine-parseable metadata is placed in a file metadata.json at the root of the tarball. Metadata is a dictionary with these keys:

export_datetime: Timestamp when this Model Library Format was generated, in strftime format "%Y-%m-%d %H:%M:%SZ".
memory: A summary of the memory usage of each generated function. Documented in Memory Usage Summary.
model_name: The name of this model (e.g. the name parameter supplied to tvm.relay.build).
executors: A list of executors supported by this model. Currently, this list is always ["graph"].
target: A dictionary mapping device_type (the underlying integer, as a string) to the sub-target describing the relay backend used for that device_type.
version: A numeric version number that identifies the format used in this Model Library Format. This number is incremented when the metadata structure or on-disk structure changes. This document reflects version 5.
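For illustration, a metadata.json dictionary shaped like the description above could look as follows. The concrete values (model name, timestamp, sub-target string) are invented for the example:

```python
# Illustrative example only: a metadata dictionary with the documented keys.
# Values are made up; real metadata.json files are written by the exporter.
import datetime
import json

metadata = {
    "export_datetime": datetime.datetime(2024, 1, 1, 12, 0, 0).strftime(
        "%Y-%m-%d %H:%M:%SZ"
    ),
    "memory": {"main": [], "operator_functions": {}},
    "model_name": "my_model",
    "executors": ["graph"],
    # device_type 1 (CPU) as a string key; the sub-target string is an example.
    "target": {"1": "llvm -mtriple=arm-none-eabi"},
    "version": 5,
}

serialized = json.dumps(metadata)
roundtrip = json.loads(serialized)
print(roundtrip["version"], roundtrip["executors"])
```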
Memory Usage Summary

A dictionary with these sub-keys:

"main": list[MainFunctionWorkspaceUsage]. A list summarizing memory usage for each workspace used by the main function and all sub-functions invoked.
"operator_functions": map[string, list[FunctionWorkspaceUsage]]. Maps operator function name to a list summarizing memory usage for each workspace used by the function.

A MainFunctionWorkspaceUsage is a dict with these keys:

"device": int. The device_type associated with this workspace.
"workspace_size_bytes": int. Number of bytes needed in this workspace by this function and all sub-functions invoked.
"constants_size_bytes": int. Size of the constants used by the main function.
"io_size_bytes": int. Sum of the sizes of the buffers used from this workspace by this function and sub-functions.
A FunctionWorkspaceUsage is a dict with these keys:

"device": int. The device_type associated with this workspace.
"workspace_size_bytes": int. Number of bytes needed in this workspace by this function.
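A memory summary with this shape can be aggregated with ordinary dictionary traversal. The sketch below totals workspace bytes per device_type; the entries and byte counts are invented, and fused_nn_conv2d is just a plausible-looking operator function name:

```python
# Illustrative sketch: total per-device workspace_size_bytes from a "memory"
# summary shaped like the description above. All numbers are made up.
from collections import defaultdict

memory = {
    "main": [
        {
            "device": 1,
            "workspace_size_bytes": 4096,
            "constants_size_bytes": 512,
            "io_size_bytes": 1024,
        }
    ],
    "operator_functions": {
        "fused_nn_conv2d": [{"device": 1, "workspace_size_bytes": 2048}],
    },
}

def workspace_bytes_by_device(memory: dict) -> dict:
    """Sum workspace_size_bytes per device_type across all entries."""
    totals = defaultdict(int)
    for usage in memory["main"]:
        totals[usage["device"]] += usage["workspace_size_bytes"]
    for usages in memory["operator_functions"].values():
        for usage in usages:
            totals[usage["device"]] += usage["workspace_size_bytes"]
    return dict(totals)

totals = workspace_bytes_by_device(memory)
print(totals)
```

A summary like this is useful when sizing static memory pools for a bare-metal target, since each device's workspaces must fit in its pool.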