Beyond GPS: How VOXL Uses VIO to Power Autonomous Drones

Written by Kyle Tyni

When you think of drones navigating through the world, you probably picture GPS guiding them from point A to point B. But what happens when GPS isn’t available, like indoors, underground, or in environments where signals are jammed or unreliable? 

That’s where visual inertial odometry (VIO) comes in. At ModalAI, our VOXL autopilot uses VIO to enable drones to fly autonomously in GPS-denied environments. 

What is VIO? 

VIO is a technique that estimates a drone’s position, orientation, and velocity by combining camera images and inertial measurement unit (IMU) data. By fusing these two inputs, a VIO algorithm provides a reliable picture of how the drone is moving frame by frame. 

Camera images let us extract “feature data”: distinct visual points, often corners or sharp edges, that stand out from their surroundings and can be detected and then tracked from frame to frame. In a VIO algorithm, each feature is represented by an ID and an (x, y) pixel location that changes as the feature moves from frame to frame.
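As a sketch of that representation (the names here are illustrative, not VOXL's actual internal types), a tracked feature is just an ID plus a history of per-frame pixel locations:

```python
from dataclasses import dataclass, field

@dataclass
class TrackedFeature:
    """A distinct visual point tracked across camera frames.

    Illustrative structure only -- not VOXL's actual internal type.
    """
    feature_id: int
    # (x, y) pixel locations, one entry per frame the feature was seen in
    track: list = field(default_factory=list)

    def observe(self, x: float, y: float) -> None:
        """Record the feature's pixel location in the current frame."""
        self.track.append((x, y))

    def displacement(self) -> tuple:
        """Pixel motion between the first and most recent observation."""
        (x0, y0), (x1, y1) = self.track[0], self.track[-1]
        return (x1 - x0, y1 - y0)

# A corner detected at (120, 45) drifts slightly over three frames
f = TrackedFeature(feature_id=7)
f.observe(120.0, 45.0)
f.observe(118.5, 46.2)
f.observe(117.1, 47.3)
print(f.displacement())  # total pixel drift since the first observation
```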

VIO on VOXL

On VOXL, VIO runs on top of OpenVINS, an open-source visual-inertial estimation framework tightly integrated with VOXL’s onboard hardware and software stack. Here’s a simplified look at what’s under the hood:

  • IMU data streams in from VOXL’s IMU server
  • Camera data is delivered by the VOXL camera server
  • OpenVINS processes both streams inside VOXL’s dedicated VIO server, which also handles setup, takeoff logic, and failure detection 

Feature data example

All of the above results in a pose estimate that includes position, orientation, velocity, and supporting state data, ready to be used by higher-level autonomy software. We also apply a preprocessing step to input images that brightens corners and improves feature detection, making the algorithm more robust in real-world conditions.
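The source doesn't specify which preprocessing operation VOXL applies, but histogram equalization is a common choice for brightening low-contrast corners before feature detection. A numpy-only sketch of that idea:

```python
import numpy as np

def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Spread an 8-bit grayscale image's intensities across the full range.

    Illustrative stand-in for VOXL's (unspecified) preprocessing step:
    low-contrast corners become brighter and easier to detect.
    """
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    # Map each intensity through the normalized cumulative distribution
    cdf_min = cdf[cdf > 0][0]
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[image]

# A dim, low-contrast image: all pixel values clustered in [40, 80)
rng = np.random.default_rng(0)
dim = rng.integers(40, 80, size=(64, 64), dtype=np.uint8)
bright = equalize_histogram(dim)
print(dim.max(), bright.max())  # contrast stretched toward the full 0-255 range
```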

Feature Tracking in Real Time

To track features, we first identify them with a detection algorithm such as FAST9. We then choose the best detected features in a gridded fashion to ensure an even distribution across the image. Once features have been detected, we can use optical flow to track their pixels and see how they move from frame to frame.
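The gridded selection step can be sketched in pure Python (the scores below stand in for detector responses such as FAST9 corner scores, which this sketch does not compute):

```python
def select_features_gridded(features, img_w, img_h, grid=4, per_cell=1):
    """Keep only the strongest features per grid cell for even coverage.

    features: list of (x, y, score) tuples, e.g. from a corner detector.
    Returns at most grid*grid*per_cell features spread across the image.
    """
    cell_w, cell_h = img_w / grid, img_h / grid
    cells = {}
    for x, y, score in features:
        key = (min(int(x / cell_w), grid - 1), min(int(y / cell_h), grid - 1))
        cells.setdefault(key, []).append((x, y, score))
    selected = []
    for bucket in cells.values():
        bucket.sort(key=lambda feat: feat[2], reverse=True)  # strongest first
        selected.extend(bucket[:per_cell])
    return selected

# Three strong corners crowd the top-left cell; weaker ones sit elsewhere
cands = [(5, 5, 90), (6, 6, 85), (7, 7, 80),
         (300, 20, 40), (30, 250, 30), (310, 260, 20)]
picked = select_features_gridded(cands, img_w=320, img_h=320, grid=2)
print(picked)  # one feature per cell: keeps (5, 5, 90), drops its crowded neighbours
```

Without the grid, the strongest candidates would cluster in one textured region, which makes the motion estimate poorly constrained; bucketing forces coverage of the whole image.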

Lucas-Kanade Optical Flow presentation from Shree Nayar

Optical flow estimates how features move between frames by finding the shift in each feature's local patch that best preserves its appearance. Measuring this pixel shift along the X and Y axes gives the motion of the feature. This method allows VOXL to track hundreds of features in real time, providing the foundation for precise, frame-by-frame motion estimates.
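A minimal single-patch Lucas-Kanade step illustrates the idea: solve a 2x2 least-squares system built from the patch's spatial and temporal gradients. This is a bare sketch (one level, one iteration); production trackers like the one on VOXL use pyramids and iterate.

```python
import numpy as np

def lk_patch_shift(prev, curr, x, y, half=3):
    """Estimate the (dx, dy) motion of the patch centred at (x, y) with one
    Lucas-Kanade step: least-squares solve of Ix*dx + Iy*dy + It = 0 over
    the patch. Illustrative single-level, single-iteration sketch."""
    ys = slice(y - half, y + half + 1)
    xs = slice(x - half, x + half + 1)
    # Spatial gradients from the previous frame, temporal difference to current
    Iy, Ix = np.gradient(prev.astype(float))
    It = curr.astype(float) - prev.astype(float)
    ix, iy, it = Ix[ys, xs].ravel(), Iy[ys, xs].ravel(), It[ys, xs].ravel()
    A = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    b = -np.array([ix @ it, iy @ it])
    return np.linalg.solve(A, b)  # (dx, dy) in pixels

# Synthetic check: a smooth blob shifted 1 pixel to the right between frames
yy, xx = np.mgrid[0:32, 0:32]
blob = lambda cx, cy: np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / 18.0)
prev, curr = blob(15, 16), blob(16, 16)
dx, dy = lk_patch_shift(prev, curr, x=15, y=16)  # should come out near (1.0, 0.0)
```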

Image from Coarse-to-Fine Flow presentation by Shree Nayar

VIO Benchmarking

Reliable autonomy depends on accuracy. That’s why we built a benchmarking workflow to measure VIO performance.

Here’s how it works:

  1. Record a flight log with onboard data.
  2. Replay the log offline while VIO runs.
  3. Compare the replayed results against ground truth.
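Step 3 needs an error metric. The source doesn't say which one VOXL's benchmark reports, but a common choice is absolute trajectory error (ATE), the RMSE of per-timestamp position error between the replayed trajectory and ground truth:

```python
import numpy as np

def absolute_trajectory_error(estimated, ground_truth):
    """RMSE of per-timestamp position error between a replayed VIO
    trajectory and ground truth. Illustrative metric -- not necessarily
    the exact statistic VOXL's benchmarking workflow reports.

    Both inputs: (N, 3) arrays of XYZ positions at matching timestamps.
    """
    diffs = np.linalg.norm(estimated - ground_truth, axis=1)
    return float(np.sqrt(np.mean(diffs ** 2)))

# Ground truth: a straight 1 m/s flight along X; the estimate drifts in Y
t = np.arange(0.0, 5.0, 0.5)
gt = np.stack([t, np.zeros_like(t), np.zeros_like(t)], axis=1)
est = gt + np.stack([np.zeros_like(t), 0.02 * t, np.zeros_like(t)], axis=1)
ate = absolute_trajectory_error(est, gt)
print(f"ATE: {ate:.3f} m")  # prints "ATE: 0.053 m"
```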

We even developed a tool inside our voxl-portal software to visualize VIO benchmarking. With it, developers can replay flights, inspect XYZ position data, track feature counts, and see exactly which features OpenVINS used to update the drone’s state.

For developers, VIO unlocks new possibilities to build autonomous applications in GPS-denied environments. For government and enterprise users, it means drones equipped with VOXL can perform reliably where traditional navigation fails, from indoor reconnaissance to underground inspection.

With VIO at the core of VOXL, we’re enabling drones to fly smarter, safer, and more reliably, no matter the environment. 

Resources

Watch the video explanation here: https://www.youtube.com/watch?v=lIdmbrRahk8&ab_channel=ModalAI 

Learn more about VIO on VOXL here: https://docs.modalai.com/flying-with-vio/