

Run Five Simultaneous Neural Networks on VOXL 2 with TensorFlow Lite


Written by Matt Turi

Like a sentinel standing guard at Buckingham Palace, the VOXL 2 Sentinel development drone keeps constant watch: its six embedded image sensors give users wide perception coverage for surveillance. The Sentinel, powered by VOXL 2, can run five concurrent neural networks with TensorFlow Lite. TensorFlow Lite is an open source framework for embedded devices that lets developers run pre-trained models for machine learning and computer vision applications.

VOXL 2 Unlocking a Dedicated Neural Processing Unit for Computer Vision

With a low-power Neural Processing Unit (NPU) embedded in the Qualcomm QRB5165 and ModalAI’s voxl-tflite-server onboard, VOXL 2 can run five different neural networks simultaneously, at 30 frames per second, out of the box. Instead of running neural networks solely on the central processing unit (CPU), VOXL 2 uses TensorFlow Lite’s built-in NNAPI support to run networks in parallel on the dedicated NPU and the graphics processing unit (GPU), processing neural network data at 30 Hz and freeing up CPU resources. That leaves VOXL 2’s CPU available for the rest of an autonomous robotics stack.
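
To make the 30 frames per second figure concrete, the short sketch below works out the time available per frame and how many frames a model would have to skip if one inference took longer than that. The latencies and the helper name `frames_to_skip` are illustrative assumptions, not measurements and not part of voxl-tflite-server.

```python
import math

CAMERA_FPS = 30  # frame rate quoted in the text above


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps


def frames_to_skip(latency_ms: float, fps: float = CAMERA_FPS) -> int:
    """Frames to drop between inferences so a model keeps up with the camera.

    Hypothetical helper: a model that needs more than one frame period
    can only process every (skip + 1)-th frame.
    """
    periods_needed = math.ceil(latency_ms / frame_budget_ms(fps))
    return max(periods_needed - 1, 0)


if __name__ == "__main__":
    print(f"budget per frame: {frame_budget_ms(CAMERA_FPS):.1f} ms")
    # Illustrative latencies only (assumed values, not benchmark results).
    for latency in (8.0, 30.0, 40.0, 70.0):
        print(f"{latency:5.1f} ms -> skip {frames_to_skip(latency)} frame(s)")
```

At 30 Hz the budget is about 33.3 ms, so any model that finishes within that time can process every frame.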

VOXL 2 Runs 5 Concurrent Imager Inputs Out of the Box with TensorFlow Lite

The VOXL SDK included with the VOXL 2 is optimized for advanced computer vision. To accelerate time to market, voxl-tflite-server is enabled with five pre-trained neural networks that developers can run with TensorFlow Lite out of the box. The number of use cases for image-based deep learning is growing, and with VOXL 2, developers have access to important visual data for their use cases. Attach a Hi-Res 4K30 imx214 or imx412 to VOXL 2 to unlock these five computer vision models:

Object Detection: Identify known objects in your robot’s FOV. Object detection uses localization and classification data to categorize and describe the location of objects. The models we provide for this task are optimized for onboard inference and use either the SSD (single-shot detector) or YOLO (you only look once) architecture to achieve low latency. This is extremely useful onboard a drone, as it enables intelligent surveillance of a scene and can provide key information depending on the task. Object detection can be used to find and track objects from the air, as an aid in autonomous flight or exploration, or for more specific use cases like warehouse and asset inspection.

Image Classification: Discern the predominant object in your robot’s FOV. Image classification labels the most important features in an image and can provide information similar to an object detector’s at a much faster speed. When the location of the object within the image is unimportant, classification models are the more efficient choice. VOXL 2 comes equipped with pre-trained image classification models covering over 1000 known categories.

Depth Estimation: Build depth maps with VOXL 2 from monocular images. VOXL 2 can infer the distance between its Hi-Res 4K30 image sensor and certain objects in its field of view (FOV). Monocular depth estimation predicts a depth value for each pixel from a single RGB image. Depth estimation is a crucial computer vision feature for autonomous drones and ground robots, as it allows them to perceive their environment and navigate safely.

Pose Estimation: Identify the orientation and position of human targets. VOXL 2 can use human pose estimation to identify points on a person’s face, body, arms, and legs, with four key points per category. Pose estimation enables developers to track one or more people in real time and monitor or study their movements. This computer vision technique is useful in applications such as motion capture for animation, AR/VR, sport or dance technique analysis, and security and surveillance.

Image Segmentation: Understand what the objects in your robot’s FOV consist of. Image segmentation divides the images your robot captures into segments, creating a pixel-based mask of each object. By eliminating regions that don’t contain pertinent information (think of the boxes from object detection), image segmentation identifies an accurate shape for each object. Drones can use image segmentation to navigate through a cluster of trees without bumping into branches.
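
The difference between a detection box and a segmentation mask can be sketched in a few lines. The example below is an illustration only: the tiny 4x5 grid of per-pixel scores is made up, and the helpers `argmax_mask` and `bounding_box` are hypothetical names, not part of the VOXL SDK. It builds a mask by taking the highest-scoring class at each pixel, then compares the mask's area with the area of the box that would enclose the same object.

```python
from typing import List, Tuple

Mask = List[List[int]]

# Made-up probability that each pixel belongs to the object (class 1).
OBJECT_PROB = [
    [0.1, 0.1, 0.1, 0.1, 0.9],
    [0.1, 0.1, 0.1, 0.8, 0.9],
    [0.1, 0.1, 0.7, 0.2, 0.1],
    [0.1, 0.9, 0.2, 0.1, 0.1],
]
# Per-pixel scores in [background, object] order, as a segmentation head might emit.
SCORES = [[[1.0 - p, p] for p in row] for row in OBJECT_PROB]


def argmax_mask(scores: List[List[List[float]]]) -> Mask:
    """Pick the highest-scoring class index for every pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


def bounding_box(mask: Mask, class_id: int) -> Tuple[int, int, int, int]:
    """Smallest inclusive (top, left, bottom, right) box around class_id pixels."""
    rows = [r for r, row in enumerate(mask) if class_id in row]
    cols = [c for row in mask for c, v in enumerate(row) if v == class_id]
    if not rows:
        raise ValueError(f"class {class_id} not present in mask")
    return min(rows), min(cols), max(rows), max(cols)


MASK = argmax_mask(SCORES)
BOX = bounding_box(MASK, class_id=1)
MASK_PIXELS = sum(v == 1 for row in MASK for v in row)
BOX_PIXELS = (BOX[2] - BOX[0] + 1) * (BOX[3] - BOX[1] + 1)

if __name__ == "__main__":
    for row in MASK:
        print("".join("#" if v == 1 else "." for v in row))
    print(f"mask covers {MASK_PIXELS} pixels, enclosing box covers {BOX_PIXELS}")
```

For this diagonal object the mask covers 5 pixels while the enclosing box covers 16, which is the extra free space a planner gains when it works from the mask instead of the box.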

TensorFlow Lite on VOXL 1 vs. VOXL 2

VOXL 2 unlocks lightweight, powerful computing. See how the neural networks perform on VOXL 2 compared to the previous-generation VOXL 1. A dedicated NPU enables VOXL 2 to process images at an extremely fast rate, enabling low-latency computer vision applications.


| Model | Task | Avg. CPU Inference (ms) | Avg. GPU Inference (ms) | Max Frames Per Second (fps) | Input Dimensions | Source |
| --- | --- | --- | --- | --- | --- | --- |
| MobileNet V2-SSDlite | Object Detection | 127.78 | 21.82 | 37.28560776 | [1,300,300,3] | link |
| MobileNet V1-SSD | Object Detection | 75.48 | 64.40 | 14.619883041 | [1,300,300,3] | link |
| MobileNet V1-SSD | Classifier | 56.70 | 56.85 | 16.47446458 | [1,224,224,3] | link |



| Model | Task | Avg. CPU Inference (ms) | Avg. GPU Inference (ms) | Avg. NNAPI Inference (ms) | Max Frames Per Second (fps) | Input Dimensions | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
|  | Object Detection | 33.89 | 24.68 | 34.42 | 34.86750349 | [1,300,300,3] | link |
| EfficientNet Lite4 | Classifier | 115.30 | 24.74 | 16.42 | 48.97159647 | [1,300,300,3] | link |
| FastDepth | Monocular Depth | 37.34 | 18.00 | 37.32 | 45.45454546 | [1,320,320,3] | link |
| DeepLab V3 | Segmentation | 63.03 | 26.81 | 61.77 | 32.45699448 | [1,321,321,3] | link |
| Movenet SinglePose Lightning | Pose Estimation | 24.58 | 28.49 | 24.61 | 34.98950315 | [1,192,192,3] | link |
| YoloV5 | Object Detection | 88.49 | 23.37 | 83.87 | 36.536335367 | [1,320,320,3] | link |
| MobileNetV1-SSD | Object Detection | 19.56 | 21.35 | 7.72 | 85.324232082 | [1,300,300,3] | link |
| MobileNetV1 | Classifier | 19.66 | 6.28 | 3.98 | 125.313283208 | [1,224,224,3] | link |
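
The Max Frames Per Second column is not simply 1000 divided by the fastest latency. The published values in the second table appear to match 1000 / (fastest average inference time + 4 ms), which would suggest a fixed per-frame overhead of about 4 ms. That 4 ms figure is an inference from the numbers themselves, not something the source states, so treat the sketch below as a consistency check and not as the benchmark's actual method. The first row's model name is missing in the source and is left that way.

```python
# Rows copied from the second benchmark table above:
# (model, avg CPU ms, avg GPU ms, avg NNAPI ms, published max fps)
ROWS = [
    ("(model name missing)", 33.89, 24.68, 34.42, 34.86750349),
    ("EfficientNet Lite4", 115.30, 24.74, 16.42, 48.97159647),
    ("FastDepth", 37.34, 18.00, 37.32, 45.45454546),
    ("DeepLab V3", 63.03, 26.81, 61.77, 32.45699448),
    ("Movenet SinglePose Lightning", 24.58, 28.49, 24.61, 34.98950315),
    ("YoloV5", 88.49, 23.37, 83.87, 36.536335367),
    ("MobileNetV1-SSD", 19.56, 21.35, 7.72, 85.324232082),
    ("MobileNetV1", 19.66, 6.28, 3.98, 125.313283208),
]

BACKENDS = ("CPU", "GPU", "NNAPI")
OVERHEAD_MS = 4.0  # ASSUMPTION: per-frame overhead inferred from the table


def fastest_backend(latencies_ms):
    """Return (backend name, latency) for the lowest average latency."""
    return min(zip(BACKENDS, latencies_ms), key=lambda pair: pair[1])


def estimated_fps(latencies_ms, overhead_ms=OVERHEAD_MS):
    """Frames per second if each frame costs the best latency plus overhead."""
    _, best_ms = fastest_backend(latencies_ms)
    return 1000.0 / (best_ms + overhead_ms)


if __name__ == "__main__":
    for model, cpu, gpu, nnapi, published in ROWS:
        backend, best_ms = fastest_backend((cpu, gpu, nnapi))
        fps = estimated_fps((cpu, gpu, nnapi))
        print(f"{model:30s} {backend:5s} {best_ms:6.2f} ms  "
              f"est {fps:7.2f} fps, published {published:7.2f} fps")
```

Running the comparison also shows that the fastest backend varies by model: in this table the GPU has the lowest average latency for four models, NNAPI for three, and the CPU for one (Movenet).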

Vision-Based Drones for Mission Critical 

An autonomous drone running multiple simultaneous neural networks reduces the pilot’s cognitive load during mission-critical flight operations. The more image data a drone can process, the safer and more capable its autonomous navigation becomes. VOXL 2 supports five simultaneous neural networks out of the box with TensorFlow Lite. To learn more about TensorFlow Lite on VOXL 2, visit: https://docs.modalai.com/voxl-tflite-server/


VOXL® 2 is the Next Generation Blue UAS Framework 2.0 Autopilot


As a Blue UAS Framework manufacturer, ModalAI® is committed to advancing the U.S. drone industry. In 2019, we partnered with the Defense Innovation Unit (DIU) to produce VOXL Flight, the founding autopilot of the Blue UAS Framework program, which curates interoperable, NDAA-compliant UAS components and software to give Government and industry partners options. VOXL Flight enabled more than 300 partners and was recognized by the RBR50 and BIG AI awards as an open, enabling technology for drones. Now that the U.S. drone industry has grown to more than 60 U.S.-based drone manufacturers and is bringing practical use cases into everyday life, we are pleased to introduce the next generation Blue UAS Framework autopilot, VOXL 2.

VOXL 2 is the world’s smallest and most advanced autopilot built in the USA. At only 16 grams, VOXL 2 boasts more AI computing than any other autopilot on the market and offers four times the computing of the previous generation. VOXL 2 integrates a PX4 real-time flight controller with an 8-core CPU, a GPU and an NPU that provide a combined 15 tera operations per second (TOPS), seven image sensors, and TDK IMUs and barometer. With support for Wi-Fi, 4G and 5G connectivity, VOXL 2 enables mission-critical use cases with reliable connectivity and beyond-visual-line-of-sight (BVLOS) navigation. This smaller-than-a-business-card and lighter-than-an-AA-battery supercomputer autopilot will enable the next generation of smaller, smarter, and safer drones.

Smaller Drones

VOXL 2 is the smallest, most advanced autopilot to date. At only 16 grams, VOXL 2 weighs less than the average AA battery. As part of the Blue sUAS 2.0 program, VOXL 2 condenses most required electronics into a single, credit-card-sized PCB, optimizing size, weight, and power (SWaP) to fit the smallest drones.

Before VOXL 2, developers might take a “Frankenstein” approach to building a robot from individual PCB components, which could increase the robot’s size and weight. VOXL 2’s advanced onboard computing and SWaP-optimized design eliminate the time and complexity of sourcing and synchronizing individual autonomy components.

Smarter Autonomy

VOXL 2 advances the latest Qualcomm Flight RB5 5G Platform. With the latest Qualcomm QRB5165 processor and an integrated PX4 flight controller on the DSP, VOXL 2 achieves powerful heterogeneous computing capabilities. The QRB5165 processor boasts an 8-core CPU running at up to 3.091 GHz, 8 GB of LPDDR5, and 15 TOPS of AI performance.

Powerful onboard computing paired with advanced imaging capabilities makes VOXL 2 the smartest Blue sUAS autopilot to date. VOXL 2’s seven concurrent image sensors, including MIPI, time of flight (TOF), stereo pair, and 8K30, enable advanced indoor and outdoor navigation including vision-based SLAM for AI and movement, GPS-denied navigation, obstacle detection and avoidance. 

Finally, VOXL SDK runs on VOXL 2. Developers can quickly bring their projects up to speed with VOXL 2’s open development platform. VOXL SDK offers support for popular open source software, such as PX4, Ubuntu 18.04, ROS 1, ROS 2, OpenCV, Docker, and TensorFlow Lite.

Safer Navigation

The core of safe autonomous navigation is a set of advanced communication and perception systems that work together. VOXL 2’s seven concurrent image sensors, combined with advanced computing from the QRB5165, deliver computer vision mapping and navigation that keep autonomous maneuvers accurate and based on real-time data.

VOXL 2’s wide connectivity options also open a plethora of safer navigation-based use cases. With a 5G add-on board, VOXL 2 achieves low latency, high bandwidth data transfer that enhances mission critical navigation, including beyond visual line-of-sight (BVLOS) to support safer, more reliable flight. 

VOXL 2 Available Now

VOXL 2 is available for purchase now at https://modalai.com/voxl-2 


SLAM: The L is for Localization – the M is for Mapping

SLAM, short for Simultaneous Localization and Mapping, is the foundation of a self-navigating system and provides the framework within which the robot can plan its path. Learn how ModalAI uses Visual Inertial Odometry (VIO) to support localization in its SLAM implementation and voxels for mapping.