{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n\n# Deploy the Pretrained Model on Jetson Nano\n**Author**: [BBuf](https://github.com/BBuf)\n\nThis is an example of using Relay to compile a ResNet model and deploy\nit on Jetson Nano.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import tvm\nfrom tvm import te\nimport tvm.relay as relay\nfrom tvm import rpc\nfrom tvm.contrib import utils, graph_executor as runtime\nfrom tvm.contrib.download import download_testdata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n## Build TVM Runtime on Jetson Nano\n\nThe first step is to build the TVM runtime on the remote device.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>All instructions in both this section and next section should be\n executed on the target device, e.g. Jetson Nano. And we assume it\n has Linux running.</p></div>\n\nSince we do compilation on local machine, the remote device is only used\nfor running the generated code. We only need to build tvm runtime on\nthe remote device.\n\n```bash\ngit clone --recursive https://github.com/apache/tvm tvm\ncd tvm\nmkdir build\ncp cmake/config.cmake build\ncd build\ncmake ..\nmake runtime -j4\n```\n<div class=\"alert alert-info\"><h4>Note</h4><p>If we want to use Jetson Nano's GPU for inference,\n we need to enable the CUDA option in `config.cmake`,\n that is, `set(USE_CUDA ON)`</p></div>\n\nAfter building runtime successfully, we need to set environment varibles\nin :code:`~/.bashrc` file. We can edit :code:`~/.bashrc`\nusing :code:`vi ~/.bashrc` and add the line below (Assuming your TVM\ndirectory is in :code:`~/tvm`):\n\n```bash\nexport PYTHONPATH=$PYTHONPATH:~/tvm/python\n```\nTo update the environment variables, execute :code:`source ~/.bashrc`.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set Up RPC Server on Device\nTo start an RPC server, run the following command on your remote device\n(Which is Jetson Nano in our example).\n\n```bash\npython -m tvm.exec.rpc_server --host 0.0.0.0 --port=9091\n```\nIf you see the line below, it means the RPC server started\nsuccessfully on your device.\n\n```bash\nINFO:RPCServer:bind to 0.0.0.0:9091\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare the Pre-trained Model\nBack to the host machine, which should have a full TVM installed (with LLVM).\n\nWe will use pre-trained model from\n[MXNet Gluon model zoo](https://mxnet.apache.org/api/python/gluon/model_zoo.html).\nYou can found more details about this part at tutorial `tutorial-from-mxnet`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from mxnet.gluon.model_zoo.vision import get_model\nfrom PIL import Image\nimport numpy as np\n\n# one line to get the model\nblock = get_model(\"resnet18_v1\", pretrained=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to test our model, here we download an image of cat and\ntransform its format.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "img_url = \"https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true\"\nimg_name = \"cat.png\"\nimg_path = download_testdata(img_url, img_name, module=\"data\")\nimage = Image.open(img_path).resize((224, 224))\n\n\ndef transform_image(image):\n image = np.array(image) - np.array([123.0, 117.0, 104.0])\n image /= np.array([58.395, 57.12, 57.375])\n image = image.transpose((2, 0, 1))\n image = image[np.newaxis, :]\n return image\n\n\nx = transform_image(image)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "synset is used to transform the label from number of ImageNet class to\nthe word human can understand.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "synset_url = \"\".join(\n [\n \"https://gist.githubusercontent.com/zhreshold/\",\n \"4d0b62f3d01426887599d4f7ede23ee5/raw/\",\n \"596b27d23537e5a1b5751d2b0481ef172f58b539/\",\n \"imagenet1000_clsid_to_human.txt\",\n ]\n)\nsynset_name = \"imagenet1000_clsid_to_human.txt\"\nsynset_path = download_testdata(synset_url, synset_name, module=\"data\")\nwith open(synset_path) as f:\n synset = eval(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we would like to port the Gluon model to a portable computational graph.\nIt's as easy as several lines.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# We support MXNet static graph(symbol) and HybridBlock in mxnet.gluon\nshape_dict = {\"data\": x.shape}\nmod, params = relay.frontend.from_mxnet(block, shape_dict)\n# we want a probability so add a softmax operator\nfunc = mod[\"main\"]\nfunc = relay.Function(func.params, relay.nn.softmax(func.body), None, func.type_params, func.attrs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are some basic data workload configurations.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "batch_size = 1\nnum_classes = 1000\nimage_shape = (3, 224, 224)\ndata_shape = (batch_size,) + image_shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compile The Graph\nTo compile the graph, we call the :py:func:`relay.build` function\nwith the graph configuration and parameters. However, You cannot to\ndeploy a x86 program on a device with ARM instruction set. It means\nRelay also needs to know the compilation option of target device,\napart from arguments :code:`net` and :code:`params` to specify the\ndeep learning workload. Actually, the option matters, different option\nwill lead to very different performance.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we run the example on our x86 server for demonstration, we can simply\nset it as :code:`llvm`. If running it on the Jetson Nano, we need to\nset it as :code:`nvidia/jetson-nano`. Set :code:`local_demo` to False\nif you want to run this tutorial with a real device.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "local_demo = True\n\nif local_demo:\n target = tvm.target.Target(\"llvm\")\nelse:\n target = tvm.target.Target(\"nvidia/jetson-nano\")\n assert target.kind.name == \"cuda\"\n assert target.attrs[\"arch\"] == \"sm_53\"\n assert target.attrs[\"shared_memory_per_block\"] == 49152\n assert target.attrs[\"max_threads_per_block\"] == 1024\n assert target.attrs[\"thread_warp_size\"] == 32\n assert target.attrs[\"registers_per_block\"] == 32768\n\nwith tvm.transform.PassContext(opt_level=3):\n lib = relay.build(func, target, params=params)\n\n# After `relay.build`, you will get three return values: graph,\n# library and the new parameter, since we do some optimization that will\n# change the parameters but keep the result of model as the same.\n\n# Save the library at local temporary directory.\ntmp = utils.tempdir()\nlib_fname = tmp.relpath(\"net.tar\")\nlib.export_library(lib_fname)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy the Model Remotely by RPC\nWith RPC, you can deploy the model remotely from your host machine\nto the remote device.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# obtain an RPC session from remote device.\nif local_demo:\n remote = rpc.LocalSession()\nelse:\n # The following is my environment, change this to the IP address of your target device\n host = \"192.168.1.11\"\n port = 9091\n remote = rpc.connect(host, port)\n\n# upload the library to remote device and load it\nremote.upload(lib_fname)\nrlib = remote.load_module(\"net.tar\")\n\n# create the remote runtime module\nif local_demo:\n dev = remote.cpu(0)\nelse:\n dev = remote.cuda(0)\n\nmodule = runtime.GraphModule(rlib[\"default\"](dev))\n# set input data\nmodule.set_input(\"data\", tvm.nd.array(x.astype(\"float32\")))\n# run\nmodule.run()\n# get output\nout = module.get_output(0)\n# get top1 result\ntop1 = np.argmax(out.numpy())\nprint(\"TVM prediction top-1: {}\".format(synset[top1]))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 0 }