{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n\n# Deploy the Pretrained Model on Jetson Nano\n**Author**: [BBuf](https://github.com/BBuf)\n\nThis is an example of using Relay to compile a ResNet model and deploy\nit on Jetson Nano.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import tvm\nfrom tvm import te\nimport tvm.relay as relay\nfrom tvm import rpc\nfrom tvm.contrib import utils, graph_executor as runtime\nfrom tvm.contrib.download import download_testdata"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n## Build TVM Runtime on Jetson Nano\n\nThe first step is to build the TVM runtime on the remote device.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>All instructions in both this section and next section should be\n  executed on the target device, e.g. Jetson Nano. And we assume it\n  has Linux running.</p></div>\n\nSince we do compilation on local machine, the remote device is only used\nfor running the generated code. We only need to build tvm runtime on\nthe remote device.\n\n```bash\ngit clone --recursive https://github.com/apache/tvm tvm\ncd tvm\nmkdir build\ncp cmake/config.cmake build\ncd build\ncmake ..\nmake runtime -j4\n```\n<div class=\"alert alert-info\"><h4>Note</h4><p>If we want to use Jetson Nano's GPU for inference,\n  we need to enable the CUDA option in `config.cmake`,\n  that is, `set(USE_CUDA ON)`</p></div>\n\nAfter building runtime successfully, we need to set environment varibles\nin :code:`~/.bashrc` file. We can edit :code:`~/.bashrc`\nusing :code:`vi ~/.bashrc` and add the line below (Assuming your TVM\ndirectory is in :code:`~/tvm`):\n\n```bash\nexport PYTHONPATH=$PYTHONPATH:~/tvm/python\n```\nTo update the environment variables, execute :code:`source ~/.bashrc`.\n\n"
      ]
    },
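    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To double-check the setup, the snippet below (run on the device, not in this\nnotebook) should print a path under `~/tvm/python` and, if CUDA was enabled,\nreport that the GPU device exists. This is a minimal sanity check, assuming\nthe build above succeeded.\n\n```python\nimport tvm\n\n# Should point into ~/tvm/python if PYTHONPATH is set correctly.\nprint(tvm.__file__)\n# True if the runtime was built with set(USE_CUDA ON).\nprint(tvm.cuda(0).exist)\n```\n"
      ]
    },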
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Set Up RPC Server on Device\nTo start an RPC server, run the following command on your remote device\n(Which is Jetson Nano in our example).\n\n```bash\npython -m tvm.exec.rpc_server --host 0.0.0.0 --port=9091\n```\nIf you see the line below, it means the RPC server started\nsuccessfully on your device.\n\n```bash\nINFO:RPCServer:bind to 0.0.0.0:9091\n```\n"
      ]
    },
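    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Optionally, you can verify the connection from the host machine before\nmoving on. This is a minimal check, assuming the device is reachable at the\nIP address below (replace it with your own):\n\n```python\nfrom tvm import rpc\n\n# Connect to the RPC server running on the device.\nremote = rpc.connect(\"192.168.1.11\", 9091)\n# A successful connection returns a handle to the remote CPU device.\nprint(remote.cpu(0))\n```\n"
      ]
    },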
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prepare the Pre-trained Model\nBack to the host machine, which should have a full TVM installed (with LLVM).\n\nWe will use pre-trained model from\n[MXNet Gluon model zoo](https://mxnet.apache.org/api/python/gluon/model_zoo.html).\nYou can found more details about this part at tutorial `tutorial-from-mxnet`.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from mxnet.gluon.model_zoo.vision import get_model\nfrom PIL import Image\nimport numpy as np\n\n# one line to get the model\nblock = get_model(\"resnet18_v1\", pretrained=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "In order to test our model, here we download an image of cat and\ntransform its format.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "img_url = \"https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true\"\nimg_name = \"cat.png\"\nimg_path = download_testdata(img_url, img_name, module=\"data\")\nimage = Image.open(img_path).resize((224, 224))\n\n\ndef transform_image(image):\n    image = np.array(image) - np.array([123.0, 117.0, 104.0])\n    image /= np.array([58.395, 57.12, 57.375])\n    image = image.transpose((2, 0, 1))\n    image = image[np.newaxis, :]\n    return image\n\n\nx = transform_image(image)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "synset is used to transform the label from number of ImageNet class to\nthe word human can understand.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "synset_url = \"\".join(\n    [\n        \"https://gist.githubusercontent.com/zhreshold/\",\n        \"4d0b62f3d01426887599d4f7ede23ee5/raw/\",\n        \"596b27d23537e5a1b5751d2b0481ef172f58b539/\",\n        \"imagenet1000_clsid_to_human.txt\",\n    ]\n)\nsynset_name = \"imagenet1000_clsid_to_human.txt\"\nsynset_path = download_testdata(synset_url, synset_name, module=\"data\")\nwith open(synset_path) as f:\n    synset = eval(f.read())"
      ]
    },
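    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a quick sanity check of the mapping, we can print a single entry.\nClass ID 281 should correspond to a cat breed in this synset file.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Sanity check: class ID 281 should map to \"tabby, tabby cat\".\nprint(synset[281])"
      ]
    },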
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now we would like to port the Gluon model to a portable computational graph.\nIt's as easy as several lines.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# We support MXNet static graph(symbol) and HybridBlock in mxnet.gluon\nshape_dict = {\"data\": x.shape}\nmod, params = relay.frontend.from_mxnet(block, shape_dict)\n# we want a probability so add a softmax operator\nfunc = mod[\"main\"]\nfunc = relay.Function(func.params, relay.nn.softmax(func.body), None, func.type_params, func.attrs)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Here are some basic data workload configurations.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "batch_size = 1\nnum_classes = 1000\nimage_shape = (3, 224, 224)\ndata_shape = (batch_size,) + image_shape"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Compile The Graph\nTo compile the graph, we call the :py:func:`relay.build` function\nwith the graph configuration and parameters. However, You cannot to\ndeploy a x86 program on a device with ARM instruction set. It means\nRelay also needs to know the compilation option of target device,\napart from arguments :code:`net` and :code:`params` to specify the\ndeep learning workload. Actually, the option matters, different option\nwill lead to very different performance.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If we run the example on our x86 server for demonstration, we can simply\nset it as :code:`llvm`. If running it on the Jetson Nano, we need to\nset it as :code:`nvidia/jetson-nano`. Set :code:`local_demo` to False\nif you want to run this tutorial with a real device.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "local_demo = True\n\nif local_demo:\n    target = tvm.target.Target(\"llvm\")\nelse:\n    target = tvm.target.Target(\"nvidia/jetson-nano\")\n    assert target.kind.name == \"cuda\"\n    assert target.attrs[\"arch\"] == \"sm_53\"\n    assert target.attrs[\"shared_memory_per_block\"] == 49152\n    assert target.attrs[\"max_threads_per_block\"] == 1024\n    assert target.attrs[\"thread_warp_size\"] == 32\n    assert target.attrs[\"registers_per_block\"] == 32768\n\nwith tvm.transform.PassContext(opt_level=3):\n    lib = relay.build(func, target, params=params)\n\n# After `relay.build`, you will get three return values: graph,\n# library and the new parameter, since we do some optimization that will\n# change the parameters but keep the result of model as the same.\n\n# Save the library at local temporary directory.\ntmp = utils.tempdir()\nlib_fname = tmp.relpath(\"net.tar\")\nlib.export_library(lib_fname)"
      ]
    },
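    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As an optional check, the exported `net.tar` is a plain tar archive of\ncompiled object files, so we can peek at what will be shipped to the device.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import tarfile\n\n# List the object files packed into the exported library archive.\nprint(tarfile.open(lib_fname).getnames())"
      ]
    },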
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Deploy the Model Remotely by RPC\nWith RPC, you can deploy the model remotely from your host machine\nto the remote device.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# obtain an RPC session from remote device.\nif local_demo:\n    remote = rpc.LocalSession()\nelse:\n    # The following is my environment, change this to the IP address of your target device\n    host = \"192.168.1.11\"\n    port = 9091\n    remote = rpc.connect(host, port)\n\n# upload the library to remote device and load it\nremote.upload(lib_fname)\nrlib = remote.load_module(\"net.tar\")\n\n# create the remote runtime module\nif local_demo:\n    dev = remote.cpu(0)\nelse:\n    dev = remote.cuda(0)\n\nmodule = runtime.GraphModule(rlib[\"default\"](dev))\n# set input data\nmodule.set_input(\"data\", tvm.nd.array(x.astype(\"float32\")))\n# run\nmodule.run()\n# get output\nout = module.get_output(0)\n# get top1 result\ntop1 = np.argmax(out.numpy())\nprint(\"TVM prediction top-1: {}\".format(synset[top1]))"
      ]
    },
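    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Optionally, we can measure the inference time on the device. The sketch\nbelow uses `GraphModule.benchmark`, which is available in recent TVM\nreleases; results on the local demo will of course differ from those on a\nreal Jetson Nano.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Benchmark the compiled model on the (possibly remote) device.\n# Each of the `repeat` measurements averages `number` runs.\nprint(module.benchmark(dev, number=1, repeat=10))"
      ]
    }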
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.5"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}