{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n\n# Deploy the Pretrained Model on Jetson Nano\n**Author**: [BBuf](https://github.com/BBuf)\n\nThis is an example of using Relay to compile a ResNet model and deploy\nit on Jetson Nano.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import tvm\nfrom tvm import te\nimport tvm.relay as relay\nfrom tvm import rpc\nfrom tvm.contrib import utils, graph_executor as runtime\nfrom tvm.contrib.download import download_testdata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n## Build TVM Runtime on Jetson Nano\n\nThe first step is to build the TVM runtime on the remote device.\n\n

Note

All instructions in this section and the next should be\n executed on the target device, e.g. Jetson Nano. We assume it\n is running Linux.

\n\nSince we compile on the local machine, the remote device is only used\nfor running the generated code, so we only need to build the TVM runtime on\nthe remote device.\n\n```bash\ngit clone --recursive https://github.com/apache/tvm tvm\ncd tvm\nmkdir build\ncp cmake/config.cmake build\ncd build\ncmake ..\nmake runtime -j4\n```\n

Note

If we want to use the Jetson Nano's GPU for inference,\n we need to enable the CUDA option in `config.cmake`,\n that is, `set(USE_CUDA ON)`.
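
As a minimal sketch of that edit, the option can also be switched from Python instead of by hand. The helper name `enable_cuda` is hypothetical (it is not part of TVM), and the sketch assumes the file contains a line of the form `set(USE_CUDA OFF)`, as the stock `cmake/config.cmake` does.

```python
def enable_cuda(config_text):
    # Hypothetical helper for illustration only; it is not part of TVM.
    # Returns config_text with the USE_CUDA option switched to ON.
    out = []
    for line in config_text.splitlines(keepends=True):
        body = line.rstrip()
        # Match only the active option line, not comments or USE_CUDNN.
        if body.strip().startswith('set(USE_CUDA '):
            # Keep the original line ending (everything after the body).
            line = 'set(USE_CUDA ON)' + line[len(body):]
        out.append(line)
    return ''.join(out)


# Assumed sample content; a real config.cmake has many more options.
sample = '''set(USE_CUDA OFF)
set(USE_CUDNN OFF)
'''
print(enable_cuda(sample))
```

On the device, the same function could be applied to the text of `build/config.cmake` (for example via `pathlib.Path.read_text` and `write_text`) before running `cmake ..`.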

\n\nAfter building the runtime successfully, we need to set environment variables\nin the `~/.bashrc` file. We can edit `~/.bashrc`\nusing `vi ~/.bashrc` and add the line below (assuming your TVM\ndirectory is `~/tvm`):\n\n```bash\nexport PYTHONPATH=$PYTHONPATH:~/tvm/python\n```\nTo update the environment variables, execute `source ~/.bashrc`.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set Up RPC Server on Device\nTo start an RPC server, run the following command on your remote device\n(the Jetson Nano in our example).\n\n```bash\npython -m tvm.exec.rpc_server --host 0.0.0.0 --port=9091\n```\nIf you see the line below, the RPC server started\nsuccessfully on your device.\n\n```bash\nINFO:RPCServer:bind to 0.0.0.0:9091\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare the Pre-trained Model\nBack on the host machine, which should have a full TVM installation (with LLVM).\n\nWe will use a pre-trained model from the\n[MXNet Gluon model zoo](https://mxnet.apache.org/api/python/gluon/model_zoo.html).\nYou can find more details about this part in the tutorial `tutorial-from-mxnet`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from mxnet.gluon.model_zoo.vision import get_model\nfrom PIL import Image\nimport numpy as np\n\n# one line to get the model\nblock = get_model(\"resnet18_v1\", pretrained=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test our model, we download an image of a cat and\nconvert it to the input format the model expects.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "img_url = \"https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true\"\nimg_name = \"cat.png\"\nimg_path = download_testdata(img_url, img_name, module=\"data\")\nimage = Image.open(img_path).resize((224, 224))\n\n\ndef transform_image(image):\n 
image = np.array(image) - np.array([123.0, 117.0, 104.0])\n image /= np.array([58.395, 57.12, 57.375])\n image = image.transpose((2, 0, 1))\n image = image[np.newaxis, :]\n return image\n\n\nx = transform_image(image)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`synset` maps each ImageNet class index to a human-readable label.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import ast\n\nsynset_url = \"\".join(\n [\n \"https://gist.githubusercontent.com/zhreshold/\",\n \"4d0b62f3d01426887599d4f7ede23ee5/raw/\",\n \"596b27d23537e5a1b5751d2b0481ef172f58b539/\",\n \"imagenet1000_clsid_to_human.txt\",\n ]\n)\nsynset_name = \"imagenet1000_clsid_to_human.txt\"\nsynset_path = download_testdata(synset_url, synset_name, module=\"data\")\nwith open(synset_path) as f:\n # the file holds a dict literal; parse it without executing arbitrary code\n synset = ast.literal_eval(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we port the Gluon model to a portable computational graph.\nIt takes only a few lines.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# We support MXNet static graph (symbol) and HybridBlock in mxnet.gluon\nshape_dict = {\"data\": x.shape}\nmod, params = relay.frontend.from_mxnet(block, shape_dict)\n# we want a probability so add a softmax operator\nfunc = mod[\"main\"]\nfunc = relay.Function(func.params, relay.nn.softmax(func.body), None, func.type_params, func.attrs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are some basic workload configurations.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "batch_size = 1\nnum_classes = 1000\nimage_shape = (3, 224, 224)\ndata_shape = (batch_size,) + image_shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compile The Graph\nTo compile the graph, we call the 
`relay.build` function\nwith the graph configuration and parameters. However, you cannot\ndeploy an x86 program on a device with an ARM instruction set. This means\nRelay also needs to know the compilation options of the target device,\napart from the arguments `func` and `params` that specify the\ndeep learning workload. The options matter: different options\nwill lead to very different performance.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we run the example on our x86 server for demonstration, we can simply\nset the target to `llvm`. If running it on the Jetson Nano, we need to\nset it to `nvidia/jetson-nano`. Set `local_demo` to False\nif you want to run this tutorial with a real device.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "local_demo = True\n\nif local_demo:\n target = tvm.target.Target(\"llvm\")\nelse:\n target = tvm.target.Target(\"nvidia/jetson-nano\")\n assert target.kind.name == \"cuda\"\n assert target.attrs[\"arch\"] == \"sm_53\"\n assert target.attrs[\"shared_memory_per_block\"] == 49152\n assert target.attrs[\"max_threads_per_block\"] == 1024\n assert target.attrs[\"thread_warp_size\"] == 32\n assert target.attrs[\"registers_per_block\"] == 32768\n\nwith tvm.transform.PassContext(opt_level=3):\n lib = relay.build(func, target, params=params)\n\n# `relay.build` returns a single module that bundles the graph, the\n# compiled library and the updated parameters. The parameters can differ\n# from the originals because optimization may change them while keeping\n# the model's results the same.\n\n# Save the library in a local temporary directory.\ntmp = utils.tempdir()\nlib_fname = tmp.relpath(\"net.tar\")\nlib.export_library(lib_fname)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy the Model Remotely by RPC\nWith RPC, you can deploy the model from your host machine\nto the remote device.\n\n" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# obtain an RPC session from remote device.\nif local_demo:\n remote = rpc.LocalSession()\nelse:\n # The following is my environment, change this to the IP address of your target device\n host = \"192.168.1.11\"\n port = 9091\n remote = rpc.connect(host, port)\n\n# upload the library to remote device and load it\nremote.upload(lib_fname)\nrlib = remote.load_module(\"net.tar\")\n\n# create the remote runtime module\nif local_demo:\n dev = remote.cpu(0)\nelse:\n dev = remote.cuda(0)\n\nmodule = runtime.GraphModule(rlib[\"default\"](dev))\n# set input data\nmodule.set_input(\"data\", tvm.nd.array(x.astype(\"float32\")))\n# run\nmodule.run()\n# get output\nout = module.get_output(0)\n# get top1 result\ntop1 = np.argmax(out.numpy())\nprint(\"TVM prediction top-1: {}\".format(synset[top1]))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 0 }