{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n\n# Deploy the Pretrained Model on Android\n**Author**: [Tomohiro Kato](https://tkat0.github.io/)\n\nThis is an example of using Relay to compile a keras model and deploy it on Android device.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import os\nimport numpy as np\nfrom PIL import Image\nimport keras\nfrom keras.applications.mobilenet_v2 import MobileNetV2\nimport tvm\nfrom tvm import te\nimport tvm.relay as relay\nfrom tvm import rpc\nfrom tvm.contrib import utils, ndk, graph_executor as runtime\nfrom tvm.contrib.download import download_testdata" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup Environment\nSince there are many required packages for Android, it is recommended to use the official Docker Image.\n\nFirst, to build and run Docker Image, we can run the following command.\n\n```bash\ngit clone --recursive https://github.com/apache/tvm tvm\ncd tvm\ndocker build -t tvm.demo_android -f docker/Dockerfile.demo_android ./docker\ndocker run --pid=host -h tvm -v $PWD:/workspace \\\n -w /workspace -p 9190:9190 --name tvm -it tvm.demo_android bash\n```\nYou are now inside the container. The cloned TVM directory is mounted on /workspace.\nAt this time, mount the 9190 port used by RPC described later.\n\n

**Note:** Please execute the following steps in the container.\nWe can execute :code:`docker exec -it tvm bash` to open a new terminal in the container.

\n\nNext we build the TVM.\n\n```bash\nmkdir build\ncd build\ncmake -DUSE_LLVM=llvm-config-8 \\\n -DUSE_RPC=ON \\\n -DUSE_SORT=ON \\\n -DUSE_VULKAN=ON \\\n -DUSE_GRAPH_EXECUTOR=ON \\\n ..\nmake -j10\n```\nAfter building TVM successfully, Please set PYTHONPATH.\n\n```bash\necho 'export PYTHONPATH=/workspace/python:/workspace/vta/python:${PYTHONPATH}' >> ~/.bashrc\nsource ~/.bashrc\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Start RPC Tracker\nTVM uses RPC session to communicate with Android device.\n\nTo start an RPC tracker, run this command in the container. The tracker is\nrequired during the whole tuning process, so we need to open a new terminal for\nthis command:\n\n```bash\npython3 -m tvm.exec.rpc_tracker --host=0.0.0.0 --port=9190\n```\nThe expected output is\n\n```bash\nINFO:RPCTracker:bind to 0.0.0.0:9190\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Register Android device to RPC Tracker\nNow we can register our Android device to the tracker.\n\nFollow this [readme page](https://github.com/apache/tvm/tree/main/apps/android_rpc) to\ninstall TVM RPC APK on the android device.\n\nHere is an example of config.mk. I enabled OpenCL and Vulkan.\n\n\n```bash\nAPP_ABI = arm64-v8a\n\nAPP_PLATFORM = android-24\n\n# whether enable OpenCL during compile\nUSE_OPENCL = 1\n\n# whether to enable Vulkan during compile\nUSE_VULKAN = 1\n\nifeq ($(USE_VULKAN), 1)\n # Statically linking vulkan requires API Level 24 or higher\n APP_PLATFORM = android-24\nendif\n\n# the additional include headers you want to add, e.g., SDK_PATH/adrenosdk/Development/Inc\nADD_C_INCLUDES += /work/adrenosdk-linux-5_0/Development/Inc\n# downloaded from https://github.com/KhronosGroup/OpenCL-Headers\nADD_C_INCLUDES += /usr/local/OpenCL-Headers/\n\n# the additional link libs you want to add, e.g., ANDROID_LIB_PATH/libOpenCL.so\nADD_LDLIBS = /workspace/pull-from-android-device/libOpenCL.so\n```\n
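\nThe :code:`ADD_LDLIBS` entry above points at a :code:`libOpenCL.so` copied from the device. A minimal sketch of fetching it with :code:`adb` (the driver path is device-specific; :code:`/system/vendor/lib64/libOpenCL.so` is a common location on recent Qualcomm devices):\n\n```bash\n# pull the OpenCL driver library from the attached device (path varies by device)\nmkdir -p /workspace/pull-from-android-device\nadb pull /system/vendor/lib64/libOpenCL.so /workspace/pull-from-android-device/libOpenCL.so\n```\n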

**Note:** At this time, don't forget to [create a standalone toolchain](https://github.com/apache/tvm/tree/main/apps/android_rpc#architecture-and-android-standalone-toolchain).\n\nFor example:\n\n```bash\n$ANDROID_NDK_HOME/build/tools/make-standalone-toolchain.sh \\\n    --platform=android-24 --use-llvm --arch=arm64 --install-dir=/opt/android-toolchain-arm64\nexport TVM_NDK_CC=/opt/android-toolchain-arm64/bin/aarch64-linux-android-g++

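\n# optional sanity check that the cross compiler works (the install dir above is only an example)\n$TVM_NDK_CC --version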
\n```\nNext, start the Android application and enter the IP address and port of RPC Tracker.\nThen you have already registered your device.\n\nAfter registering devices, we can confirm it by querying rpc_tracker\n\n```bash\npython3 -m tvm.exec.query_rpc_tracker --host=0.0.0.0 --port=9190\n```\nFor example, if we have 1 Android device.\nthe output can be\n\n```bash\nQueue Status\n----------------------------------\nkey total free pending\n----------------------------------\nandroid 1 1 0\n----------------------------------\n```\nTo confirm that you can communicate with Android, we can run following test script.\nIf you use OpenCL and Vulkan, please set :code:`test_opencl` and :code:`test_vulkan` in the script.\n\n```bash\nexport TVM_TRACKER_HOST=0.0.0.0\nexport TVM_TRACKER_PORT=9190\n```\n```bash\ncd /workspace/apps/android_rpc\npython3 tests/android_rpc_test.py\n```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load pretrained keras model\nWe load a pretrained MobileNetV2(alpha=0.5) classification model provided by keras.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "keras.backend.clear_session() # Destroys the current TF graph and creates a new one.\nweights_url = \"\".join(\n [\n \"https://github.com/JonathanCMitchell/\",\n \"mobilenet_v2_keras/releases/download/v1.1/\",\n \"mobilenet_v2_weights_tf_dim_ordering_tf_kernels_0.5_224.h5\",\n ]\n)\nweights_file = \"mobilenet_v2_weights.h5\"\nweights_path = download_testdata(weights_url, weights_file, module=\"keras\")\nkeras_mobilenet_v2 = MobileNetV2(\n alpha=0.5, include_top=True, weights=None, input_shape=(224, 224, 3), classes=1000\n)\nkeras_mobilenet_v2.load_weights(weights_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to test our model, here we download an image of cat and\ntransform its format.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "img_url = \"https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true\"\nimg_name = \"cat.png\"\nimg_path = download_testdata(img_url, img_name, module=\"data\")\nimage = Image.open(img_path).resize((224, 224))\ndtype = \"float32\"\n\n\ndef transform_image(image):\n image = np.array(image) - np.array([123.0, 117.0, 104.0])\n image /= np.array([58.395, 57.12, 57.375])\n image = image.transpose((2, 0, 1))\n image = image[np.newaxis, :]\n return image\n\n\nx = transform_image(image)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "synset is used to transform the label from number of ImageNet class to\nthe word human can understand.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "synset_url = \"\".join(\n [\n \"https://gist.githubusercontent.com/zhreshold/\",\n \"4d0b62f3d01426887599d4f7ede23ee5/raw/\",\n \"596b27d23537e5a1b5751d2b0481ef172f58b539/\",\n \"imagenet1000_clsid_to_human.txt\",\n ]\n)\nsynset_name = \"imagenet1000_clsid_to_human.txt\"\nsynset_path = download_testdata(synset_url, synset_name, module=\"data\")\nwith open(synset_path) as f:\n synset = eval(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compile the model with relay\nIf we run the example on our x86 server for demonstration, we can simply\nset it as :code:`llvm`. If running it on the Android device, we need to\nspecify its instruction set. 
Set :code:`local_demo` to False if you want\nto run this tutorial with a real device.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "local_demo = True\n\n# by default on CPU target will execute.\n# select 'cpu', 'opencl' and 'vulkan'\ntest_target = \"cpu\"\n\n# Change target configuration.\n# Run `adb shell cat /proc/cpuinfo` to find the arch.\narch = \"arm64\"\ntarget = tvm.target.Target(\"llvm -mtriple=%s-linux-android\" % arch)\n\nif local_demo:\n target = tvm.target.Target(\"llvm\")\nelif test_target == \"opencl\":\n target = tvm.target.Target(\"opencl\", host=target)\nelif test_target == \"vulkan\":\n target = tvm.target.Target(\"vulkan\", host=target)\n\ninput_name = \"input_1\"\nshape_dict = {input_name: x.shape}\nmod, params = relay.frontend.from_keras(keras_mobilenet_v2, shape_dict)\n\nwith tvm.transform.PassContext(opt_level=3):\n lib = relay.build(mod, target=target, params=params)\n\n# After `relay.build`, you will get three return values: graph,\n# library and the new parameter, since we do some optimization that will\n# change the parameters but keep the result of model as the same.\n\n# Save the library at local temporary directory.\ntmp = utils.tempdir()\nlib_fname = tmp.relpath(\"net.so\")\nfcompile = ndk.create_shared if not local_demo else None\nlib.export_library(lib_fname, fcompile)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy the Model Remotely by RPC\nWith RPC, you can deploy the model remotely from your host machine\nto the remote android device.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "tracker_host = os.environ.get(\"TVM_TRACKER_HOST\", \"127.0.0.1\")\ntracker_port = int(os.environ.get(\"TVM_TRACKER_PORT\", 9190))\nkey = \"android\"\n\nif local_demo:\n remote = rpc.LocalSession()\nelse:\n tracker = rpc.connect_tracker(tracker_host, tracker_port)\n # When running a heavy model, we should increase the `session_timeout`\n remote = tracker.request(key, priority=0, session_timeout=60)\n\nif local_demo:\n dev = remote.cpu(0)\nelif test_target == \"opencl\":\n dev = remote.cl(0)\nelif test_target == \"vulkan\":\n dev = remote.vulkan(0)\nelse:\n dev = remote.cpu(0)\n\n# upload the library to remote device and load it\nremote.upload(lib_fname)\nrlib = remote.load_module(\"net.so\")\n\n# create the remote runtime module\nmodule = runtime.GraphModule(rlib[\"default\"](dev))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Execute on TVM\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# set input data\nmodule.set_input(input_name, tvm.nd.array(x.astype(dtype)))\n# run\nmodule.run()\n# get output\nout = module.get_output(0)\n\n# get top1 result\ntop1 = np.argmax(out.numpy())\nprint(\"TVM prediction top-1: {}\".format(synset[top1]))\n\nprint(\"Evaluate inference time cost...\")\nprint(module.benchmark(dev, number=1, repeat=10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sample Output\nThe following is the result of 'cpu', 'opencl' and 'vulkan' using Adreno 530 on Snapdragon 820\n\nAlthough we can run on a GPU, it is slower than CPU.\nTo speed up, we need to write and optimize the schedule according to the GPU architecture.\n\n```bash\n# cpu\nTVM prediction top-1: tiger cat\nEvaluate inference time cost...\nMean inference time (std dev): 37.92 ms (19.67 ms)\n\n# opencl\nTVM prediction 
top-1: tiger cat\nEvaluate inference time cost...\nMean inference time (std dev): 419.83 ms (7.49 ms)\n\n# vulkan\nTVM prediction top-1: tiger cat\nEvaluate inference time cost...\nMean inference time (std dev): 465.80 ms (4.52 ms)\n```\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 0 }